OpenChemistry / 42

The answer
Geometry collection, linking to molecule collection #5

Open cryos opened 5 years ago

cryos commented 5 years ago

Each entry in this collection would always have a parent molecule and exactly one geometry, with optional provenance, e.g. from calculation A, generated from InChI using Open Babel, from a structure resolver, etc.

cryos commented 5 years ago

To record some of our discussions: a geometry would contain a complete CJSON (atoms, bonds, positions) and a link to a parent molecule. If a parent molecule is not present, one should be created with the same geometry before creating the geometry. It can optionally link to a calculation that generated the geometry, or carry basic provenance such as uploaded by mhanwell, generated from InChI, etc.
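As a minimal sketch of the record described above, assuming plain in-memory dicts in place of the real collections: the field names (`moleculeId`, `calculationId`, `provenance`) and the `create_geometry` helper are hypothetical, chosen only for illustration.

```python
import uuid

# Hypothetical in-memory stand-ins for the molecules and geometries collections.
molecules = {}
geometries = {}


def create_geometry(cjson, molecule_id=None, provenance=None, calculation_id=None):
    """Store a geometry, creating the parent molecule first if it is missing."""
    if molecule_id is None or molecule_id not in molecules:
        # No parent yet: create one with the same geometry before the geometry itself.
        molecule_id = molecule_id or uuid.uuid4().hex
        molecules[molecule_id] = {"_id": molecule_id, "cjson": cjson}

    geometry = {"_id": uuid.uuid4().hex, "moleculeId": molecule_id, "cjson": cjson}
    # Provenance and the calculation link are both optional.
    if provenance is not None:
        geometry["provenance"] = provenance
    if calculation_id is not None:
        geometry["calculationId"] = calculation_id
    geometries[geometry["_id"]] = geometry
    return geometry


# Illustrative CJSON for water; coordinates are flattened x, y, z triples.
water = {
    "atoms": {
        "elements": {"number": [8, 1, 1]},
        "coords": {"3d": [0.0, 0.0, 0.117, 0.0, 0.757, -0.469, 0.0, -0.757, -0.469]},
    },
    "bonds": {"connections": {"index": [0, 1, 0, 2]}, "order": [1, 1]},
}
geom = create_geometry(water, provenance="generated from InChI")
```

Here the parent molecule did not exist, so it is created with the same CJSON and the new geometry points back at it.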

The geometry from the molecules collection will still be used as the default for a molecule. We should list all geometries for a molecule and offer the optional ability to find/use a specific geometry in workflows.

cryos commented 5 years ago

To address a few new points brought up by @alesgenova:

Also the calculation already has a cjson field

The geometry collection is separate, and will only act as input geometry. It should not duplicate any more than the atoms and bonds keys, plus some links for provenance. The input and output geometry may be the same, which is OK; they may differ for optimizations, etc.
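A rough sketch of that trimming step, assuming a calculation's CJSON carries extra keys beyond the structure: the `to_input_geometry` helper and the `properties` payload are hypothetical placeholders.

```python
def to_input_geometry(cjson):
    """Keep only the keys an input geometry needs: atoms and bonds."""
    return {key: cjson[key] for key in ("atoms", "bonds") if key in cjson}


# Illustrative calculation CJSON; the "properties" entry is a placeholder
# standing in for whatever a calculation adds on top of the structure.
calculation_cjson = {
    "atoms": {
        "elements": {"number": [1, 1]},
        "coords": {"3d": [0.0, 0.0, 0.0, 0.0, 0.0, 0.74]},
    },
    "bonds": {"connections": {"index": [0, 1]}, "order": [1]},
    "properties": {"placeholder": True},
}
input_geometry = to_input_geometry(calculation_cjson)
```

The calculation keeps its full CJSON; only the copy stored as an input geometry is reduced.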

we should probably have an input and output geometry for calculations

It would be good to link a calculation to the parent molecule as we do now, and optionally to a geometry if one was chosen. If one isn't, the geometry in the molecule is still used.
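The fallback could look something like this, assuming a calculation record with a required `moleculeId` and an optional `geometryId` (both field names, and the `resolve_input_geometry` helper, are hypothetical):

```python
def resolve_input_geometry(calculation, molecules, geometries):
    """Return the chosen geometry's CJSON, or the molecule's default one."""
    geometry_id = calculation.get("geometryId")
    if geometry_id is not None:
        return geometries[geometry_id]["cjson"]
    # No geometry chosen: fall back to the geometry stored on the molecule.
    return molecules[calculation["moleculeId"]]["cjson"]


# Tiny stand-in records; the "label" keys just make the two cases easy to tell apart.
molecules = {"mol1": {"cjson": {"label": "default geometry"}}}
geometries = {"geom1": {"moleculeId": "mol1", "cjson": {"label": "chosen geometry"}}}

default_case = resolve_input_geometry({"moleculeId": "mol1"}, molecules, geometries)
chosen_case = resolve_input_geometry(
    {"moleculeId": "mol1", "geometryId": "geom1"}, molecules, geometries
)
```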

How is the geometry going to fit in the current workflow in jupyterlab?

It stays the same in the default case, where we try to determine the best input geometry or generate one. A specific geometry should be an optional parameter. I want to improve search in the single page application, and it would be good to think about this within Jupyter.

If a user provides a new geometry are we gonna try and match it to existing geometries?

No, at least not any time soon in my opinion. We can always add scans for equivalent geometries later, but between atom reordering, fuzzy coordinate matching, etc., I would like to keep this simple. If the user supplies a geometry, we accept it and generate a unique ID they can use to refer to it.
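To illustrate the "accept it and hand back an ID" behaviour, here is a small sketch assuming an in-memory dict and random UUIDs as identifiers. The `add_geometry` helper and the ID scheme are assumptions; the real server may generate IDs differently.

```python
import uuid

# Hypothetical in-memory stand-in for the geometries collection.
geometries = {}


def add_geometry(cjson, molecule_id):
    """Accept the geometry as supplied and return a fresh unique ID for it."""
    geometry_id = uuid.uuid4().hex
    # No equivalence check: no atom reordering, no fuzzy coordinate matching.
    geometries[geometry_id] = {"moleculeId": molecule_id, "cjson": cjson}
    return geometry_id


helium = {"atoms": {"elements": {"number": [2]}, "coords": {"3d": [0.0, 0.0, 0.0]}}}
first_id = add_geometry(helium, "mol1")
second_id = add_geometry(helium, "mol1")
```

Submitting the same geometry twice simply yields two records with different IDs, which keeps the write path trivial and leaves any deduplication scan as a later, separate step.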