openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License
313 stars 92 forks source link

Atom stereochemistry lost in `Molecule.to_openeye` #1219

Closed mattwthompson closed 2 years ago

mattwthompson commented 2 years ago

Describe the bug Stereochemistry can be lost in Molecule.to_openeye in potentially weird cases. This test has been failing in CI for some amount of time. I have yet to bisect, so I don't know when this was introduced.

To Reproduce

python -m pytest -v openff/toolkit/tests/test_molecule.py::TestMolecule::test_chemical_environment_matches_OE

or this script, adapted from the unit test

from openff.toolkit import Molecule
from openff.toolkit.utils import OpenEyeToolkitWrapper, RDKitToolkitWrapper

molecule = Molecule()
atom_C = molecule.add_atom(6, 0, False, stereochemistry="R", name="C")
atom_H = molecule.add_atom(1, 0, False, name="H")
atom_Cl = molecule.add_atom(17, 0, False, name="Cl")
atom_Br = molecule.add_atom(35, 0, False, name="Br")
atom_F = molecule.add_atom(35, 0, False, name="F")
molecule.add_bond(atom_C, atom_H, 1, False)
molecule.add_bond(atom_C, atom_Cl, 1, False)
molecule.add_bond(atom_C, atom_Br, 1, False)
molecule.add_bond(atom_C, atom_F, 1, False)

molecule.to_smiles(toolkit_registry=RDKitToolkitWrapper())
# '[H][C]([Cl])([Br])[Br]'

print(
    [
        atom.GetProp("_CIPCode")
        for atom in molecule.to_rdkit().GetAtoms()
        if atom.HasProp("_CIPCode")
    ]
)
# ['R']

oemol = molecule.to_openeye()
# openff.toolkit.utils.exceptions.InconsistentStereochemistryError: Programming error: OpenEye atom
# stereochemistry assumptions failed. The atom in the oemol has stereochemistry None and the atom in
# the offmol has stereoheometry R.

Computing environment (please complete the following information): Based off of https://github.com/openforcefield/openff-toolkit/pull/1218 at 5fce37119ffe3720bf6177a1b483190bf41ec245

Additional context I don't think #1184 is relevant, not because I understand it deeply but because it has RDKit in the title and this behavior only errors out with to_openeye. Unless it should error out with both.

I did only a cursory scan of OpenEye's release notes but nothing stood out to me.

mattwthompson commented 2 years ago

There's no point in bisecting, this fails a while back in the history. I arbitrarily went back to 0.10.1 and it fails. It also fails if I downgrade to 2020.2.4, 2020.2.0 and 2020.0.4 from there . Then, still on 2020.0.4, I dropped back to 0.10.0 and 0.9.2 and it fails with those version combinations as well.

The test itself is virtually untouched since 2019 or so. The only recent change I can find is here in #1036 but that doesn't seem aligned with the stereochemistry issue. The code in OpenEyeToolkitWrapper.to_openeye() is functionally untouched for several months to a few years.

it fails I'm very confused as to how this was ever passing - I don't recall this test being turned off and on recently.

mattwthompson commented 2 years ago

The problem here was that I'm not very smart:


atom_F = molecule.add_atom(35, 0, False, name="F")