Closed peastman closed 1 year ago
Thanks for the report! I can reproduce this and I agree it's a bug. I can't offer a solution or quick workaround at the moment (barring an OpenEye license).
This root of this is that the molecule doesn't survive our round-trips with RDKit:
>>> from openff.toolkit.topology import Molecule, Topology
>>> mol = Molecule.from_smiles('CC1=C/N/C(C#N)=C(C#N)\\N=C/C(C)=C\\N/C(C#N)=C(C#N)/N=C\\1')
>>> mol.to_rdkit()
[11:52:10] Conflicting single bond directions around double bond at index 6.
[11:52:10] BondStereo set to STEREONONE and single bond directions set to NONE.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/mwt/miniconda3/envs/openff-system/lib/python3.9/site-packages/openff/toolkit/utils/base_wrapper.py", line 51, in wrapped_function
value = func(*args, **kwargs)
File "/Users/mwt/miniconda3/envs/openff-system/lib/python3.9/site-packages/openff/toolkit/topology/molecule.py", line 5157, in to_rdkit
return toolkit.to_rdkit(self, aromaticity_model=aromaticity_model)
File "/Users/mwt/miniconda3/envs/openff-system/lib/python3.9/site-packages/openff/toolkit/utils/rdkit_wrapper.py", line 1793, in to_rdkit
cls._assign_rdmol_bonds_stereo(molecule, rdmol)
File "/Users/mwt/miniconda3/envs/openff-system/lib/python3.9/site-packages/openff/toolkit/utils/rdkit_wrapper.py", line 2405, in _assign_rdmol_bonds_stereo
assert stereo_rdbond.GetStereo() == desired_rdk_stereo_code
AssertionError
As a separate issue, there are probably other ways to do this "is the molecule already in my set of reference molecules?" check here. This will magically dissipate when the TopologyMolecule
paradigm is deprecated. Changing that logic in the mean time would side-step this issue, except it could push failures to a different stage in a pipeline.
FWIW these behaviors are specific to the RDKit interface; none pop up using OpenEye toolkits. (Not suggesting this as a solution, only as information for diagnosing the issue.)
After some digging, I don't have an answer here. I think the following may be good next directions to pursue:
/
and \
) in the circular structure may make it impossible to represent the desired stereoisomerYeah, super odd that this works using the OpenEyeToolkitWrapper
, but not the RDKitToolkitWrapper
. For reference, the specific molecule in question, loaded using the OpenEyeToolkitWrapper
, visualizes like this:
mol.visualize(backend='openeye')
mol = Molecule.from_smiles('CC1=C/N/C(C#N)=C(C#N)\\N=C/C(C)=C\\N/C(C#N)=C(C#N)/N=C\\1')
Except that 2D structure is wrong. On one side of the macrocycle, the pair of nitrile groups should be trans- to each other according to the input SMILES, and on the other side they should be cis- to each other. However, in the depiction above, BOTH pairs of nitrile substituents are cis- to each other.
If I have OE generate conformers and look in 3D, the correct stereochemistry is obeyed, but the molecule seems very strained (it is folded like a hot dog, which seems to be at odds with its aromaticness).
mol.generate_conformers()
mol.visualize(backend='nglview')
If I chop off all the nitriles, I still get success from OpenEye:
mol = Molecule.from_smiles('CC1=C/N/C=C\\N=C/C(C)=C\\N/C=C/N=C\\1')
and now I also get success from RDKit, though either the drawing code isn't giving me a nice structure or the bond stereo is off:
And I can actually keep two on the nitriles in the structure and be fine, they just can't be next to each other:
mol = Molecule.from_smiles('CC1=C/N/C(C#N)=C\\N=C/C(C)=C\\N/C=C(C#N)/N=C\\1') # works in both, nitriles far from each other
OE:
RDK:
If I put them next to each other, only the OE wrapper works:
mol = Molecule.from_smiles('CC1=C/N/C(C#N)=C(C#N)\\N=C/C(C)=C\\N/C=C/N=C\\1') # nitriles adjacent, breaks rdkitwrapper
For what it's worth, here is the molecule in question: https://pubchem.ncbi.nlm.nih.gov/substance/135240687
Here's another one I found that produces an exception, but this time when I try to canonicalize the atom order.
>>> mol = Molecule.from_smiles('COc1ccc(/N=C(NO)/C(C)=N/N=C/c2ccc(OC)cc2OC)cc1')
>>> mol.canonical_order_atoms()
[08:11:31] Conflicting single bond directions around double bond at index 0.
[08:11:31] BondStereo set to STEREONONE and single bond directions set to NONE.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/peastman/workspace/openff-toolkit/openff/toolkit/topology/molecule.py", line 2603, in __repr__
self.name, self.to_smiles()
File "/home/peastman/workspace/openff-toolkit/openff/toolkit/topology/molecule.py", line 2730, in to_smiles
smiles = to_smiles_method(self, isomeric, explicit_hydrogens, mapped)
File "/home/peastman/workspace/openff-toolkit/openff/toolkit/utils/rdkit_wrapper.py", line 667, in to_smiles
rdmol = self.to_rdkit(molecule)
File "/home/peastman/workspace/openff-toolkit/openff/toolkit/utils/rdkit_wrapper.py", line 1774, in to_rdkit
cls._assign_rdmol_bonds_stereo(molecule, rdmol)
File "/home/peastman/workspace/openff-toolkit/openff/toolkit/utils/rdkit_wrapper.py", line 2382, in _assign_rdmol_bonds_stereo
cls._flip_rdbond_direction(flipped_rdbond, paired_bonds)
File "/home/peastman/workspace/openff-toolkit/openff/toolkit/utils/rdkit_wrapper.py", line 2307, in _flip_rdbond_direction
_flip(rdbond, paired_rdbonds, flipped=set(), ignored=None)
File "/home/peastman/workspace/openff-toolkit/openff/toolkit/utils/rdkit_wrapper.py", line 2281, in _flip
assert b.GetBondDir() in supported_directions
AssertionError
I can no longer reproduce the original behavior
>>> from openff.toolkit import Molecule, Topology, __version__
>>> __version__
'0.12.0'
>>> mol = Molecule.from_smiles('CC1=C/N/C(C#N)=C(C#N)\\N=C/C(C)=C\\N/C(C#N)=C(C#N)/N=C\\1')
>>> top = Topology()
>>> top.add_molecule(mol)
1
>>> from openff.toolkit.utils.toolkits import GLOBAL_TOOLKIT_REGISTRY, OpenEyeToolkitWrapper
>>> GLOBAL_TOOLKIT_REGISTRY.deregister_toolkit(OpenEyeToolkitWrapper)
>>> mol = Molecule.from_smiles('CC1=C/N/C(C#N)=C(C#N)\\N=C/C(C)=C\\N/C(C#N)=C(C#N)/N=C\\1')
>>> top = Topology()
>>> top.add_molecule(mol)
1
>>> GLOBAL_TOOLKIT_REGISTRY.registered_toolkit_versions
{'The RDKit': '2022.09.1', 'AmberTools': '22.0', 'Built-in Toolkit': None}
nor the snippet in the comment just prior to this one:
>>> from openff.toolkit import Molecule, Topology, __version__
>>> __version__
'0.12.0'
>>> mol = Molecule.from_smiles('COc1ccc(/N=C(NO)/C(C)=N/N=C/c2ccc(OC)cc2OC)cc1')
>>>
>>> mol.canonical_order_atoms()
Molecule with name '' and SMILES '[H]c1c(c(c(c(c1/C(=N/N=C(/C(=N/c2c(c(c(c(c2[H])[H])OC([H])([H])[H])[H])[H])/N([H])O[H])\C([H])([H])[H])/[H])OC([H])([H])[H])[H])OC([H])([H])[H])[H]'
>>> from openff.toolkit.utils.toolkits import GLOBAL_TOOLKIT_REGISTRY, OpenEyeToolkitWrapper
>>> GLOBAL_TOOLKIT_REGISTRY.deregister_toolkit(OpenEyeToolkitWrapper)
>>> mol = Molecule.from_smiles('COc1ccc(/N=C(NO)/C(C)=N/N=C/c2ccc(OC)cc2OC)cc1')
>>> mol.canonical_order_atoms()
Molecule with name '' and SMILES '[H][O][N]([H])[C](=[N]\[c]1[c]([H])[c]([H])[c]([O][C]([H])([H])[H])[c]([H])[c]1[H])/[C](=[N]/[N]=[C](\[H])[c]1[c]([H])[c]([H])[c]([O][C]([H])([H])[H])[c]([H])[c]1[O][C]([H])([H])[H])[C]([H])([H])[H]'
>>> >>> GLOBAL_TOOLKIT_REGISTRY.registered_toolkit_versions
{'The RDKit': '2022.09.1', 'AmberTools': '22.0', 'Built-in Toolkit': None}
I'm not sure what exactly fixed this - maybe #1190?
Describe the bug I've run into a strange case where I can create a Molecule without problem, but I get an exception when I try to add it to a Topology.
To Reproduce
Output
Computing environment (please complete the following information):
Ubuntu 18.04
conda list