OpenFreeEnergy / gufe

grand unified free energy by OpenFE
https://gufe.readthedocs.io
MIT License
28 stars 8 forks source link

AttributeError with GUFE 0.8 + RDKit 2022 #201

Closed craabreu closed 1 year ago

craabreu commented 1 year ago

As RDKit's version is not pinned, GUFE can be installed along with RDKit 2022.03.2. However, the following code uses attributes of rdkit.Chem.rdchem.ChiralType only available in latter versions:

https://github.com/OpenFreeEnergy/gufe/blob/b9f76f3af63aec48d7bba252811f915f4f1e58a4/gufe/components/smallmoleculecomponent.py#L16-L25

As an alternative to restricting RDKit's version, I'd suggest doing something like the following:

_INT_TO_ATOMCHIRAL = { 
    0: Chem.rdchem.ChiralType.CHI_UNSPECIFIED, 
    1: Chem.rdchem.ChiralType.CHI_TETRAHEDRAL_CW, 
    2: Chem.rdchem.ChiralType.CHI_TETRAHEDRAL_CCW, 
    3: Chem.rdchem.ChiralType.CHI_OTHER,
}
if hasattr(Chem.rdchem.ChiralType, "CHI_TETRAHEDRAL"):
    _INT_TO_ATOMCHIRAL.update(
        {
            4: Chem.rdchem.ChiralType.CHI_TETRAHEDRAL, 
            5: Chem.rdchem.ChiralType.CHI_ALLENE, 
            6: Chem.rdchem.ChiralType.CHI_SQUAREPLANAR, 
            7: Chem.rdchem.ChiralType.CHI_TRIGONALBIPYRAMIDAL, 
            8: Chem.rdchem.ChiralType.CHI_OCTAHEDRAL,
        }
    )
dwhswenson commented 1 year ago

Thanks for catching this, and for the very helpful report. I think we might end up needing to pin RDKit. The code in question is for serialization. The proposed fix will at least prevent errors on startup, but would leave open the following problem:

  1. On a machine with a newer RDKit installed, I write out a transformation (as a JSON file) to be run on my cluster. I transfer that to the cluster.
  2. The cluster has an older RDKit installed, and my jobs crash immediately with a KeyError, because values 4-8 of this enum haven't been loaded.

But @richardjgowers may have ideas for a better solution -- he's the one who has spent the most time in this part of the code.

richardjgowers commented 1 year ago

thanks @craabreu yes I think the probability of actually encountering nontetrahedral stereo is low enough that it's not worth the pin so I've added your suggestion

mikemhenry commented 1 year ago

I think I rather pin RDKit than run into an issue where we hit key errors.

What is the problem with the pin?

craabreu commented 1 year ago

I'm currently forced to pin GUFE to v0.6.1 because updating RDKit would mess with other people's work.

As a compromise solution, how about throwing an exception when someone tries to deserialize an SMC with an unhandled key? Something to the effect of "deserializing this object requires rdkit>=2022.09.1".

richardjgowers commented 1 year ago

@craabreu I think this should be fixed for you with a recent release of gufe, we shouldn't have any difficult pins on rdkit any more