Although we typically serialize `GufeTokenizable`s to `dict`, `keyed_dict`, or `KeyedChain` forms that are then turned into JSON, it can also be convenient to pickle these objects in certain cases. When using `multiprocessing`, for example, `GufeTokenizable`s passed as arguments to functions executed in other processes are typically pickled and unpickled to cross the process boundary.
This presents an issue for `ExplicitMoleculeComponent`s: RDKit `Mol` objects do not by default include atom properties when pickled (https://github.com/rdkit/rdkit/issues/6573#issuecomment-1781734093). This is especially a problem for preserving partial charges, since we store these as RDKit `Mol` properties.
It's possible to change this behavior globally for RDKit with:
from rdkit import Chem
Chem.SetDefaultPickleProperties(Chem.PropertyPickleOptions.AllProps)
Should we set this within `gufe` so as to avoid this issue across the board, or will this have undesirable consequences?
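For discussion, here is a minimal sketch of a scoped alternative to flipping the setting globally at import time. The helper name `pickle_all_props` and the property name `PartialCharge` are hypothetical and only for illustration; neither is part of `gufe` or RDKit. It assumes `Chem.GetDefaultPickleProperties` is available to read the current setting so it can be restored afterwards.

```python
import pickle
from contextlib import contextmanager

from rdkit import Chem


@contextmanager
def pickle_all_props():
    """Temporarily have RDKit include all properties when pickling.

    Hypothetical helper for illustration; not part of gufe or RDKit.
    """
    previous = Chem.GetDefaultPickleProperties()
    Chem.SetDefaultPickleProperties(Chem.PropertyPickleOptions.AllProps)
    try:
        yield
    finally:
        # Restore whatever the caller had configured before.
        Chem.SetDefaultPickleProperties(previous)


mol = Chem.MolFromSmiles("CO")
# Illustrative atom property standing in for a stored partial charge.
mol.GetAtomWithIdx(0).SetDoubleProp("PartialCharge", 0.25)

# Properties are written when the pickle is created, so dumps() must run
# while the setting is active.
with pickle_all_props():
    restored = pickle.loads(pickle.dumps(mol))

charge = restored.GetAtomWithIdx(0).GetDoubleProp("PartialCharge")
```

One caveat for the `multiprocessing` case: the setting is per-process state, so it would presumably need to be active in whichever process does the pickling, which includes worker processes when they return `Mol`-containing results. That may be an argument for setting it once inside `gufe` rather than scoping it.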