OpenFreeEnergy / gufe

grand unified free energy by OpenFE
https://gufe.readthedocs.io
MIT License

SmallMoleculeComponent.to_openff() does not preserve the molecule protonation state #297

Open LilDojd opened 3 months ago

LilDojd commented 3 months ago

Expected Behavior

All explicit hydrogens are preserved when converting to an OpenFF Molecule.

Current Behavior

The OFFMolecule is constructed directly from SmallMoleculeComponent._rdkit via __init__:

https://github.com/OpenFreeEnergy/gufe/blob/29f98ec0227ed9f30ea6b84096dc13818a9ac8d3/gufe/components/smallmoleculecomponent.py#L183-L196

This is essentially equivalent to calling

OFFMolecule.from_rdkit(rdmol, allow_undefined_stereo=True, hydrogens_are_explicit=False)

which causes RDKit to add hydrogens, so the protonation state of the input is not preserved.

Possible Solution

Use the .from_rdkit() constructor directly, with hydrogens_are_explicit set to True.
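
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the gufe implementation: the helper name `to_openff_explicit_h` is hypothetical, and it assumes the RDKit molecule passed in already carries every hydrogen explicitly (for example, because it was read from an SDF with hydrogens retained).

```python
def to_openff_explicit_h(rdmol):
    """Convert an RDKit molecule to an OpenFF Molecule without adding hydrogens.

    Hypothetical helper for illustration only. Assumes `rdmol` already has
    all of its hydrogens as explicit atoms.
    """
    # Imported lazily so the function can be defined even where the
    # OpenFF toolkit is not installed.
    from openff.toolkit.topology import Molecule as OFFMolecule

    return OFFMolecule.from_rdkit(
        rdmol,
        allow_undefined_stereo=True,   # same setting as quoted in the issue
        hydrogens_are_explicit=True,   # ask the toolkit not to add hydrogens
    )
```

With this call the atom count of the resulting OpenFF molecule should match `rdmol.GetNumAtoms()`, which gives a quick way to confirm that no hydrogens were added.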

Steps to Reproduce

  1. Use the SDF file in gdp_correct.sdf.zip
  2. Run the following:
from gufe import SmallMoleculeComponent

gdp = SmallMoleculeComponent.from_sdf_file("gdp_correct.sdf")
gdp.to_openff().to_file("gdp_ugly.sdf", "sdf")
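
One way to check the reproduction above without opening a viewer is to compare the number of hydrogen atoms in the input and output files. The sketch below uses only the standard library; the function name `count_hydrogens_v2000` is hypothetical, and it assumes the file holds a single record in V2000 mol block format (V3000 files are not handled).

```python
def count_hydrogens_v2000(molblock: str) -> int:
    """Count hydrogen atoms in the first V2000 mol block of an SDF string.

    Assumes the fixed-width V2000 layout: three header lines, a counts line
    whose first three characters hold the atom count, then one line per atom
    with the element symbol in columns 32-34.
    """
    lines = molblock.splitlines()
    n_atoms = int(lines[3][0:3])          # atom count from the counts line
    atom_lines = lines[4:4 + n_atoms]     # the atom block follows directly
    return sum(1 for line in atom_lines if line[31:34].strip() == "H")
```

Reading both files, for example with `pathlib.Path("gdp_correct.sdf").read_text()`, and comparing the two counts would show whether the round trip added or removed hydrogens.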

> gdp_ugly.sdf:

LilDojd commented 3 months ago

I know this particular SDF has incorrect bond orders, but I think the user's input protonation state should still generally be respected.