openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License
312 stars 92 forks source link

Add property that denotes if a molecule has/may have stereoisomers #739

Open mattwthompson opened 4 years ago

mattwthompson commented 4 years ago

Is your feature request related to a problem? Please describe. For molecules that potentially have stereoisomers, the only way to check if they do is to enumerate them out and see if any were found. This is an sub-optimal pattern:

>>> from openforcefield.topology import Molecule
>>> mol = Molecule.from_smiles("C\C(F)=C(/F)C[C@](C)(Cl)Br")
>>> stereoisomers = mol.enumerate_stereoisomers()
>>> if stereoisomers:
...     for isomol in stereoisomers:
...         pass # do something
...

Describe the solution you'd like It would be nice to separate out the questions of "do I have any stereoisomers?" and "what are my stereisomers?" and behave something like this

>>> from openforcefield.topology import Molecule
>>> mol = Molecule.from_smiles("C\C(F)=C(/F)C[C@](C)(Cl)Br")
>>> if mol.stereo_possible:
...     stereoisomers = mol.enumerate_stereoisomers()
...     for isomol in stereoisomers:
...         pass # do something
...

and therefore skipping the mol.enumerate_stereoisomers() on molecules that have none.

I don't know if this should be a property of the molecule as a whole or on bonds and atoms or on all three. If it can reliably be tracked on a per-atom and per-bond basis, then a molecule-level property could just iterate through those and short-circuit when it finds any?

Describe alternatives you've considered The original snippet isn't too bad. I have no intuition as to the performance penalty with the sometimes-unecessary stereoisomer enumeration (I would guess this is prohibitive with larger molecules and/or datasets, but I don't have hands-on experience there).

Additional context See https://github.com/openforcefield/openff-cli/pull/4#discussion_r498579334 - @j-wags, feel free to edit and/or add to this for clarity.

j-wags commented 4 years ago

I've been coming around to the idea that each atom/bond should have a stereo_possible property that indicates whether it COULD have stereochemistry. OE and RDKit do this and I think it's a smart design.