openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License
318 stars 92 forks source link

Problem generating conformer when stereochemistry explicitly defined #669

Open trevorgokey opened 4 years ago

trevorgokey commented 4 years ago

Describe the bug Generating a conformer for this SMILES fails in both OE/RD, but it seems like all stereocenters are specified:

from  openforcefield.topology.molecule import Molecule
mol = Molecule.from_smiles('C=CCn1c([C@@H]2C[C@@H]3CC[C@@H]2O3)nnc1N1CCN(c2ccccc2)CC1')
mol.generate_conformers()

For RDKit, it silently fails (and is the reason I am submitting this bug report):

~/projects/openforcefield/openforcefield/topology/molecule.py in generate_conformers(self, toolkit_registry, n_conformers, rms_cutoff, clear_existing)
   2604                 n_conformers=n_conformers,
   2605                 rms_cutoff=rms_cutoff,
-> 2606                 clear_existing=clear_existing,
   2607             )
   2608         elif isinstance(toolkit_registry, ToolkitWrapper):

~/projects/openforcefield/openforcefield/utils/toolkits.py in call(self, method_name, raise_exception_types, *args, **kwargs)
   4413                     for exception_type in raise_exception_types:
   4414                         if isinstance(e, exception_type):
-> 4415                             raise e
   4416                     errors.append((toolkit, e))
   4417 

~/projects/openforcefield/openforcefield/utils/toolkits.py in call(self, method_name, raise_exception_types, *args, **kwargs)
   4409                 method = getattr(toolkit, method_name)
   4410                 try:
-> 4411                     return method(*args, **kwargs)
   4412                 except Exception as e:
   4413                     for exception_type in raise_exception_types:

~/projects/openforcefield/openforcefield/utils/toolkits.py in generate_conformers(self, molecule, n_conformers, rms_cutoff, clear_existing)
   2854             molecule._conformers = list()
   2855 
-> 2856         for conformer in molecule2._conformers:
   2857             molecule._add_conformer(conformer)
   2858 

TypeError: 'NoneType' object is not iterable

and for OE we have some logic in the TK I don't fully understand:

       # Set atom stereochemistry now that all connectivity is in place
       for atom, oeatom in zip(molecule.atoms, oemol_atoms):
           if not atom.stereochemistry:
               continue

           # Set arbitrary initial stereochemistry
           neighs = [n for n in oeatom.GetAtoms()]
           oeatom.SetStereo(neighs, oechem.OEAtomStereo_Tetra,
                            oechem.OEAtomStereo_Right)

           # Flip chirality if stereochemistry isincorrect
           oeatom_stereochemistry = OpenEyeToolkitWrapper._openeye_cip_atom_stereochemistry(
               oemol, oeatom)
           if oeatom_stereochemistry != atom.stereochemistry:
               # Flip the stereochemistry
               oeatom.SetStereo(neighs, oechem.OEAtomStereo_Tetra,
                                oechem.OEAtomStereo_Left)
               # Verify it matches now as a sanity check
               oeatom_stereochemistry = OpenEyeToolkitWrapper._openeye_cip_atom_stereochemistry(
                   oemol, oeatom)
               if oeatom_stereochemistry != atom.stereochemistry:
                   raise Exception(
                       'Programming error: OpenEye atom stereochemistry assumptions failed.'
                   )

It throws the above exception during conformer generation (openforcefield/openforcefield/utils/toolkits.py:1377)

Is there something I am missing about this molecule that makes stereochemistry ambiguous/difficult to generate a conformer for? I'll think about it more, but wanted to post for extra eyes.

Expected output

For RDKit, a more descriptive error. For OE, a description of the assumptions that were in place that failed.

Additional context The SMILES above with chirality specified was generated from RDKit isomer enumeration based on C=CCN1C(C2CC3CCC2O3)=NN=C1N1CCN(C2=CC=CC=C2)CC1 which is an entry in the Enamine REAL dataset.

mattwthompson commented 4 years ago

I can reproduce you getting the firstAre you getting Exception: OpenEye Omega conformer generation failed or Programming error: OpenEye atom stereochemistry assumptions failed.? The exception I'm hitting is in generate_conformers (L1787), not the snippet you posted in from_openeye, which gets called immediately after the exception I'm hitting.

----> 1 mol.generate_conformers(n_conformers=1, toolkit_registry=ToolkitRegistry(toolkit_precedence=[OpenEyeToolkitWrapper]))

~/software/openforcefield/openforcefield/topology/molecule.py in generate_conformers(self, toolkit_registry, n_conformers, rms_cutoff, clear_existing)
   2394                                          n_conformers=n_conformers,
   2395                                          rms_cutoff=rms_cutoff,
-> 2396                                          clear_existing=clear_existing)
   2397         elif isinstance(toolkit_registry, ToolkitWrapper):
   2398             toolkit = toolkit_registry

~/software/openforcefield/openforcefield/utils/toolkits.py in call(self, method_name, raise_exception_types, *args, **kwargs)
   4490                     for exception_type in raise_exception_types:
   4491                         if isinstance(e, exception_type):
-> 4492                             raise e
   4493                     errors.append((toolkit, e))
   4494

~/software/openforcefield/openforcefield/utils/toolkits.py in call(self, method_name, raise_exception_types, *args, **kwargs)
   4486                 method = getattr(toolkit, method_name)
   4487                 try:
-> 4488                     return method(*args, **kwargs)
   4489                 except Exception as e:
   4490                     for exception_type in raise_exception_types:

~/software/openforcefield/openforcefield/utils/toolkits.py in generate_conformers(self, molecule, n_conformers, rms_cutoff, clear_existing)
   1784             new_status = omega(oemol)
   1785             if new_status is False:
-> 1786                 raise Exception("OpenEye Omega conformer generation failed")
   1787
   1788

Exception: OpenEye Omega conformer generation failed

So you're getting OpenEye to generate conformers, but getting weird stereochemistry out, whereas I'm just not getting it conformers at all? Can you set a trace to see if molecule2._conformers exists? What OpenEye version are you using? I'm getting the same behavior on 2019 and 2020.

The same thing happens in RDKit but there isn't an exception when molecule2._conformers is None. A patch like ec493cf2054bff922e68785c5cfa4f123990cdd2 would catch this with a more specific exception, but that doesn't really help answer the question of why conformer generation silently failed in the first place; I would expect RDKit to complain.

trevorgokey commented 4 years ago

I am using OE 2020 with OpenFF TK 0.7.0. I am getting Programming error: OpenEye atom stereochemistry assumptions failed. which seems to occur before the exception you're getting.

In the OE codeblock I presented above, which tries both OEAtomStereo_{Right,Left}, I should note that atom.stereochemistry is already set to 'S', so checking both seems odd. Further, when

1373 oeatom_stereochemistry = OpenEyeToolkitWrapper._openeye_cip_atom_stereochemistry(oemol, oeatom)

is called, Warning: The aromaticity of the molecule needs to be perceived first. is emitted and oeatom_stereochemistry is None after execution. The warning is specifically returned when calling cip = oechem.OEPerceiveCIPStereo(oemol, oeatom) (openforcefield/openforcefield/utils/toolkits.py:1018). So what is odd to me is that atom.stereochemistry is clearly set to S, but the OETK is completely confused. Not sure where atom.stereochemistry is set at the moment.

Given that RDkit is mute on the matter, if we can pin down the OE case we might learn something transferable to solving the RD case.

j-wags commented 4 years ago

Speaking with @trevorgokey now -- The comment just above this is caused by the 2020 OETK changes and a generic "Openeye Omega failed" message is received if using OFFTK 0.7.1 or master.

trevorgokey commented 4 years ago

Yes, pulling master got me to the other error. We are currently in the mindset that given explicitly defined SMILES (all centers clockwise) is impossible, and is therefore an issue upstream with the RDkit stereoisomer enumerator.

jthorton commented 4 years ago

On master with openeye-toolkits 2020.0.4 I don't get that molecule when using mol.enumerate_stereoisomers(rationalise=True) this flag tries to generate a conformer for the molecule under the hood and if it fails we assume it's not possible, I only get the following molecules C=CCN1C(=NN=C1N2CCN(CC2)c3ccccc3)[C@@H]4C[C@H]5CC[C@@H]4O5 C=CCN1C(=NN=C1N2CCN(CC2)c3ccccc3)[C@H]4C[C@H]5CC[C@@H]4O5 C=CCN1C(=NN=C1N2CCN(CC2)c3ccccc3)[C@@H]4C[C@@H]5CC[C@H]4O5 C=CCN1C(=NN=C1N2CCN(CC2)c3ccccc3)[C@H]4C[C@@H]5CC[C@H]4O5 There should be a similar flag in RDKit which should stop unphysical molecules coming up, but for this case with rdkit as the backend, I get no stereoisomers for this molecule so something is not right there.