OpenChemistry / openchemistrypy


Move more input/output conversion logic into docker containers #28

Open cjh1 opened 5 years ago

cjh1 commented 5 years ago

To get a clear separation of concerns, we should do more of the code-specific format conversion within the containers. They would then just accept a format that could be produced using Avogadro or Open Babel, and similarly for the output format.

cjh1 commented 5 years ago

One option is to make the containers conform to a common CLI that specifies which options should be provided, such as the input/output format.
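As a rough sketch of what such a shared CLI could look like, the parser below uses Python's `argparse`. The program name, the flag names (`--input-format`, `--output-format`, `--geometry`, `--output`) and the format list are all hypothetical; nothing here has been agreed on in this issue.

```python
import argparse

# Hypothetical list of interchange formats; the issue does not fix one.
FORMATS = ["cjson", "cml", "xyz"]


def build_parser():
    """Build the parser that every container entry point could share."""
    parser = argparse.ArgumentParser(prog="run-calculation")
    # All flag names below are assumptions made for illustration.
    parser.add_argument("--input-format", choices=FORMATS, required=True)
    parser.add_argument("--output-format", choices=FORMATS, default="cjson")
    parser.add_argument("--geometry", required=True,
                        help="path to the input geometry file")
    parser.add_argument("--output", required=True,
                        help="path where the converted result is written")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["--input-format", "xyz", "--geometry", "in.xyz", "--output", "out.cjson"]
)
```

With a contract like this, the caller only needs to know the shared flags, and each container keeps its code-specific conversion internal.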

cryos commented 5 years ago

My ideal would be a JSON document that could be supplied in a setup interface for someone with admin or elevated rights in our web interface. That would contain metadata like the name, description, and container URL. It could also contain all the options, but they could be in the CLI of the container as you say.
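To make that concrete, here is a minimal sketch of loading and checking such a description in Python. The key names (`name`, `description`, `container_url`, `options`) and the example URL are assumptions for illustration, not an existing schema.

```python
import json
from dataclasses import dataclass, field

# Hypothetical required keys for a container description.
REQUIRED_KEYS = {"name", "description", "container_url"}


@dataclass
class ContainerDescription:
    name: str
    description: str
    container_url: str
    options: dict = field(default_factory=dict)


def load_description(text):
    """Parse a JSON container description and check the required keys."""
    data = json.loads(text)
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return ContainerDescription(
        name=data["name"],
        description=data["description"],
        container_url=data["container_url"],
        options=data.get("options", {}),
    )


# Example document; every value here is made up.
example = """
{
  "name": "nwchem",
  "description": "NWChem single point energy",
  "container_url": "example.org/openchemistry/nwchem:latest",
  "options": {"input_format": "xyz", "output_format": "cjson"}
}
"""
description = load_description(example)
```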

At a basic level the Docker container should accept JSON with job metadata, theory, basis, etc. It should also accept CJSON, CML, XYZ, SMILES, InChI, etc. for the molecular graph/geometry. The output should be mapped to a format that we understand, such as the above, the various log file formats supported in Avogadro, etc. Property prediction would likely be metadata output, but QM codes could ideally get mapped.
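A minimal sketch of the input side, assuming the job metadata arrives as a JSON object and the geometry format is named separately. The function name `parse_job` and the returned dictionary layout are hypothetical; the format list is taken from the formats named above.

```python
import json

# Geometry formats named in this comment; the final list is still open.
GEOMETRY_FORMATS = {"cjson", "cml", "xyz", "smiles", "inchi"}


def parse_job(job_json, geometry_format):
    """Validate the geometry format and parse the job metadata JSON."""
    fmt = geometry_format.lower()
    if fmt not in GEOMETRY_FORMATS:
        raise ValueError(f"unsupported geometry format: {geometry_format}")
    job = json.loads(job_json)
    if not isinstance(job, dict):
        raise ValueError("job metadata must be a JSON object")
    # Hypothetical layout handed on to the code-specific input generator.
    return {"geometry_format": fmt, "parameters": job}


job = parse_job('{"theory": "dft", "basis": "6-31g"}', "XYZ")
```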

This interaction sounds quite a bit like the Avogadro input generator script design on steroids. I think this general concept is great to work towards, with the goal that I could even add private containers for my fork of NWChem with a bug fix, additional feature, etc. We would want to tag calculations so that we knew which image was used in that case too, and ideally be able to do a lookup on the metadata.
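Tagging could be as simple as storing the image reference next to each calculation record. The sketch below assumes calculations are plain dictionaries; the `image` key, its fields, and the example values are hypothetical.

```python
def tag_calculation(calculation, image_name, image_digest):
    """Return a copy of the calculation record tagged with its image."""
    tagged = dict(calculation)
    tagged["image"] = {"name": image_name, "digest": image_digest}
    return tagged


def find_by_image(calculations, image_name):
    """Look up all calculations that were run with the given image name."""
    return [
        calc for calc in calculations
        if calc.get("image", {}).get("name") == image_name
    ]


# Made-up records; the digests are placeholders, not real image digests.
calculations = [
    tag_calculation({"id": 1, "theory": "dft"}, "nwchem", "sha256:0000"),
    tag_calculation({"id": 2, "theory": "dft"}, "nwchem-fork", "sha256:1111"),
]
fork_runs = find_by_image(calculations, "nwchem-fork")
```

Recording a digest as well as the name would let a lookup distinguish a stock image from a private fork even if both were published under the same tag.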