openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License

Error from `Molecule.from_smiles()` #1166

Open csu1505110121 opened 2 years ago

csu1505110121 commented 2 years ago

Dear All,

Describe the bug

When I call Molecule.from_smiles() on the following SMILES string, which contains several nitrogen atoms, it raises an error.

SMILES = 'c1c2cc3c(c1)cnc(c3)C=CC(C)(C)C(=O)O[C@@H](C(=O)N[C@@H](C)C(=O)N1N[C@@H](CCC1)C(=O)N[C@@H]2C)C(C)C'

The call fails with "UndefinedStereochemistryError: Unable to make OFFMol from RDMol: Unable to make OFFMol from SMILES: RDMol has unspecified stereochemistry. Undefined chiral centers are:"

Does anyone have an idea what causes this?

To Reproduce

from openforcefield.topology import Molecule
Molecule.from_smiles('c1c2cc3c(c1)cnc(c3)C=CC(C)(C)C(=O)O[C@@H](C(=O)N[C@@H](C)C(=O)N1N[C@@H](CCC1)C(=O)N[C@@H]2C)C(C)C')

Output

UndefinedStereochemistryError             Traceback (most recent call last)
/tmp/ipykernel_28996/2255973072.py in <module>
----> 1 Molecule.from_smiles('c1c2cc3c(c1)cnc(c3)C=CC(C)(C)C(=O)O[C@@H](C(=O)N[C@@H](C)C(=O)N1N[C@@H](CCC1)C(=O)N[C@@H]2C)C(C)C')

/opt/anaconda2/envs/openmm_new/lib/python3.8/site-packages/openforcefield/topology/molecule.py in from_smiles(cls, smiles, hydrogens_are_explicit, toolkit_registry, allow_undefined_stereo)
   2481         """
   2482         if isinstance(toolkit_registry, ToolkitRegistry):
-> 2483             molecule = toolkit_registry.call(
   2484                 "from_smiles",
   2485                 smiles,

/opt/anaconda2/envs/openmm_new/lib/python3.8/site-packages/openforcefield/utils/toolkits.py in call(self, method_name, raise_exception_types, *args, **kwargs)
   5797                     for exception_type in raise_exception_types:
   5798                         if isinstance(e, exception_type):
-> 5799                             raise e
   5800                 errors.append((toolkit, e))
   5801

/opt/anaconda2/envs/openmm_new/lib/python3.8/site-packages/openforcefield/utils/toolkits.py in call(self, method_name, raise_exception_types, *args, **kwargs)
   5793                 method = getattr(toolkit, method_name)
   5794                 try:
-> 5795                     return method(*args, **kwargs)
   5796                 except Exception as e:
   5797                     for exception_type in raise_exception_types:

/opt/anaconda2/envs/openmm_new/lib/python3.8/site-packages/openforcefield/utils/toolkits.py in from_smiles(self, smiles, hydrogens_are_explicit, allow_undefined_stereo, _cls)
   3366         # Throw an exception/warning if there is unspecified stereochemistry.
   3367         if allow_undefined_stereo == False:
-> 3368             self._detect_undefined_stereo(
   3369                 rdmol, err_msg_prefix="Unable to make OFFMol from SMILES: "
   3370             )

/opt/anaconda2/envs/openmm_new/lib/python3.8/site-packages/openforcefield/utils/toolkits.py in _detect_undefined_stereo(cls, rdmol, err_msg_prefix, raise_warning)
   4775             else:
   4776                 msg = "Unable to make OFFMol from RDMol: " + msg
-> 4777             raise UndefinedStereochemistryError(msg)
   4778
   4779     @staticmethod

UndefinedStereochemistryError: Unable to make OFFMol from RDMol: Unable to make OFFMol from SMILES: RDMol has unspecified stereochemistry. Undefined chiral centers are:

Computing environment (please complete the following information):

openff-forcefields     2.0.0       pyh6c4a22f_0    conda-forge
openff-toolkit         0.10.1      pyhd8ed1ab_0    conda-forge
openff-toolkit-base    0.10.1      pyhd8ed1ab_0    conda-forge
openforcefield         0.8.4       pyh39e3cac_0    omnia
openforcefields        2.0.0rc.2   py_0            omnia

j-wags commented 2 years ago

Thanks for the detailed report, @csu1505110121!

I reproduced the error that you reported and tried to reduce it to the smallest structure that triggers the same issue. I ended up with this:

from openff.toolkit.topology import Molecule

mol = Molecule.from_smiles('C2N1C[C@H](C2)CCC1', 
                            allow_undefined_stereo=True)
mol

Warning (not error because allow_undefined_stereo=True): Unable to make OFFMol from RDMol: RDMol has unspecified stereochemistry. Undefined chiral centers are:

  • Atom N (index 1)


The likely cause is that the nitrogen closes two rings, which makes RDKit treat it as a possible chiral center. The commercial OpenEye toolkit raises the same error, so RDKit and OpenEye probably use similar logic for this situation. For smaller rings (like the one in my minimal example) I could be convinced that the nitrogen has stable stereochemistry, but for the macrocycle in the molecule you reported that seems unlikely.

So I think this is a real error. Unfortunately, the problem is upstream of us. We've had some plans to work around it locally with a hacky fix (cc #146 and #156), but that would likely cause some really complex problems for other users, so it's not high on our priority list.

There is a simple workaround, though: pass allow_undefined_stereo=True when making the molecule, as I do in the minimal reproducing code above. This downgrades the error to a warning. Subsequent steps may give more warnings about this molecule, but you should be able to get parameters assigned and simulate it :-)
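For anyone who wants strict stereochemistry checking by default and the lenient behaviour only when needed, here is a minimal sketch of a fallback wrapper. The helper name `from_smiles_with_fallback` and its two optional arguments are hypothetical, and the import path `openff.toolkit.utils.exceptions` is an assumption about recent toolkit versions that may need adjusting for your install.

```python
import warnings


def from_smiles_with_fallback(smiles, molecule_cls=None, stereo_error=None):
    """Parse strictly first; retry with allow_undefined_stereo=True on failure.

    molecule_cls and stereo_error are hypothetical hooks that default to the
    OpenFF classes. They exist only so the retry logic can be exercised
    without the toolkit installed.
    """
    if molecule_cls is None or stereo_error is None:
        # Assumed import locations; adjust if your toolkit version differs.
        from openff.toolkit.topology import Molecule
        from openff.toolkit.utils.exceptions import UndefinedStereochemistryError

        molecule_cls = molecule_cls or Molecule
        stereo_error = stereo_error or UndefinedStereochemistryError

    try:
        # Strict mode: raises if any stereocenter is flagged as undefined.
        return molecule_cls.from_smiles(smiles)
    except stereo_error as err:
        # Surface the original message so the flagged atoms are not lost.
        warnings.warn(f"Retrying with allow_undefined_stereo=True: {err}")
        return molecule_cls.from_smiles(smiles, allow_undefined_stereo=True)
```

Calling `from_smiles_with_fallback(smiles)` with the SMILES string from the report should then return a molecule and emit a warning instead of raising, provided the toolkit is installed.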

Also, I noticed that you have two versions of the toolkit installed, and that your code is using the older one! You should replace `from openforcefield.topology import Molecule` with `from openff.toolkit.topology import Molecule`. More info about this major update can be found at #819.
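A quick way to check whether both packages are present in an environment is to ask the standard library for their installed versions. This is only a sketch: the distribution names are taken from the environment listing in the report, the helper `installed_versions` is hypothetical, and it assumes the packages expose standard metadata that `importlib.metadata` can read.

```python
from importlib import metadata

# Distribution names as they appear in the reported environment listing.
LEGACY_DIST = "openforcefield"
CURRENT_DIST = "openff-toolkit"


def installed_versions(dist_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


found = installed_versions([LEGACY_DIST, CURRENT_DIST])
if found[LEGACY_DIST] is not None and found[CURRENT_DIST] is not None:
    # Both present: imports from the legacy namespace will use the old code.
    print("Both toolkit packages are installed:", found)
    print("Prefer: from openff.toolkit.topology import Molecule")
```

If both versions are reported, removing the legacy `openforcefield` package from the environment avoids importing the old code by accident.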