Electrostatics / apbs

Software for biomolecular electrostatics and solvation calculations
http://www.poissonboltzmann.org/

The ApbsLegacyInput produces invalid dictionary keys #129

Open intendo opened 3 years ago

intendo commented 3 years ago

The dictionaries produced cannot be converted to JSON correctly because their sub-keys are integers instead of strings.
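A minimal sketch of the symptom, using a made-up dictionary (the real structure produced by ApbsLegacyInput may differ). Python's standard `json` module does not reject integer keys; it silently writes them as strings, so the data does not survive a round trip:

```python
import json

# Hypothetical stand-in for a parsed legacy input file.
# The integer sub-keys (1, 2) are the problem being reported.
legacy = {"READ": {"mol": {1: "mol1.pqr", 2: "mol2.pqr"}}}

encoded = json.dumps(legacy)   # succeeds, but 1 and 2 are written as "1" and "2"
decoded = json.loads(encoded)  # keys come back as strings, not integers

print(encoded)
print(decoded == legacy)       # False: the round trip changed the keys
```

Any code that later looks up `decoded["READ"]["mol"][1]` would raise a `KeyError`, which is why string keys from the start are safer.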

sobolevnrm commented 3 years ago

Is the JSON (in lieu of dictionaries) for communicating via APIs?

intendo commented 3 years ago

The intent is that the input files should be in a form that can be used by as many tools as possible. A valid dictionary structure should be able to be serialized into JSON or another "container" via some dumps() method (e.g. json.dumps(apbs_config)) and read back unchanged.
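One possible way to get there, sketched under the assumption that the keys only need to become strings: a small recursive helper (`stringify_keys` is a hypothetical name, not part of APBS) that normalizes a nested structure before it is serialized.

```python
import json
from typing import Any


def stringify_keys(node: Any) -> Any:
    """Return a copy of node in which every dict key is a string."""
    if isinstance(node, dict):
        return {str(key): stringify_keys(value) for key, value in node.items()}
    if isinstance(node, (list, tuple)):
        # JSON has no tuple type, so tuples are emitted as lists.
        return [stringify_keys(item) for item in node]
    return node


# Hypothetical parsed input with integer sub-keys.
raw = {"READ": {"mol": {1: "mol1.pqr", 2: "mol2.pqr"}}}
clean = stringify_keys(raw)

# With string keys, a dump followed by a load reproduces the dictionary exactly.
round_tripped = json.loads(json.dumps(clean))
print(round_tripped == clean)
```

Fixing the keys where the dictionaries are built would be cleaner than post-processing; this only illustrates the target shape.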

I am trying to figure out a good way to "query" the resulting dictionaries that can be created from the legacy files and your new YAML configuration file format.

It seems like a good idea to create the dictionaries in such a way that they can be "queried" for values, and there are several libraries for querying JSON. If we make our dictionaries standard enough that they can be read, transformed, loaded, dumped, and queried with JSON or other tools, then we have lots of options.

For example, assuming the READ section of the input file gets built into a dictionary like the following:

    "READ": {
        "mol": {
            "pqr": [
                {
                    "file": "mol1.pqr"
                },
                {
                    "file": "mol2.pqr"
                },
                {
                    "file": "complex.pqr"
                }
            ]
        }
    }

I have a simple JSON parser that lets you query arbitrarily nested dicts and arrays. With it, something like the following gets all of the PQR files that need to be read:

    config = parse(ApbsLegacyInput.load(filename))
    # [...] selects every element of the "pqr" list
    for idx, pqr_file in enumerate(config.READ.mol.pqr[...].file):
        print(f"FILE idx: {idx}, file: {pqr_file}")

It produces the following output:

    FILE idx: 0, file: mol1.pqr
    FILE idx: 1, file: mol2.pqr
    FILE idx: 2, file: complex.pqr
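For comparison, the same access pattern can be sketched with plain Python and no extra library. This is not the parser mentioned above; `query` is a hypothetical helper that walks a list of path steps, where `...` (Ellipsis) stands for "every element of this list":

```python
from typing import Any, Iterator, List


def query(node: Any, path: List[Any]) -> Iterator[Any]:
    """Yield every value reached by following path; Ellipsis fans out over a list."""
    if not path:
        yield node
        return
    head, rest = path[0], path[1:]
    if head is Ellipsis:
        if isinstance(node, list):
            for item in node:
                yield from query(item, rest)
    elif isinstance(node, dict) and head in node:
        yield from query(node[head], rest)
    # Anything else (missing key, wrong type) simply yields nothing.


config = {
    "READ": {
        "mol": {
            "pqr": [
                {"file": "mol1.pqr"},
                {"file": "mol2.pqr"},
                {"file": "complex.pqr"},
            ]
        }
    }
}

pqr_files = list(query(config, ["READ", "mol", "pqr", ..., "file"]))
for idx, pqr_file in enumerate(pqr_files):
    print(f"FILE idx: {idx}, file: {pqr_file}")
```

A missing key yields an empty result instead of raising, which may or may not be the behavior we want for required sections.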

There is another one that works on nested dicts or JSON at: https://github.com/perfecto25/dictor

Do you have something in mind to use for an access pattern to get data out of the configuration dictionaries?

sobolevnrm commented 3 years ago

It seems like YAML handles this situation and is the format we're heading towards for input files. Is that an option?