Currently, I am experimenting with providing a separate docking template to the OEPositDockingFeaturizer in the posit_template branch. This way you could dock into a structure from PDB entry 4aoj but bias the Posit docking algorithm by the ligand co-crystallized in another PDB entry, e.g. 4yne.
I pass the required information as attributes to the corresponding ligand and protein instances of the protein-ligand complex (see code example below). The Featurizer should be able to read these attributes to do a proper job. This has always worked fine in other Featurizers when the attributes were set on the protein only. However, setting additional attributes on the ligand instance gives surprising results when using multiprocessing: every attribute except smiles and name (both given during initialization) is lost.
from kinoml.core.components import BaseProtein
from kinoml.core.ligands import Ligand
from kinoml.core.systems import ProteinLigandComplex
from kinoml.features.complexes import OEPositDockingFeaturizer
compounds = {
    "larotrectinib": "C1CC(N(C1)C2=NC3=C(C=NN3C=C2)NC(=O)N4CCC(C4)O)C5=C(C=CC(=C5)F)F",
    "selitrectinib": "CC1CCC2=C(C=C(C=N2)F)C3CCCN3C4=NC5=C(C=NN5C=C4)C(=O)N1"
}
systems = []
for name, smiles in compounds.items():
    protein = BaseProtein(name="NTRK1")
    protein.pdb_id = "4aoj"
    protein.expo_id = "V4Z"
    protein.chain_id = "A"
    ligand = Ligand.from_smiles(smiles=smiles, name=name)
    ligand.docking_template_pdb_id = "4yne"  # lost in multiprocessing
    ligand.docking_template_expo_id = "4EK"  # lost in multiprocessing
    ligand.docking_template_chain_id = "A"  # lost in multiprocessing
    systems.append(ProteinLigandComplex(components=[protein, ligand]))
featurizer = OEPositDockingFeaturizer(output_dir="posit", use_multiprocessing=True)
systems = featurizer.featurize(systems)
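A quick search for this behavior gave me a few hints. It looks like there may be a serialization problem: multiprocessing pickles the objects it sends to worker processes, so anything that does not survive pickling is gone on the other side.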
Interestingly, this is not a problem when using the RDKitLigand class instead of the Ligand class to store the attributes. Since the Ligand class is based on the _OpenForceFieldMolecule class, the problem may arise on their end.
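To illustrate the suspected mechanism, here is a minimal sketch that uses only the standard library. Both classes are hypothetical stand-ins, not the actual KinoML or OpenForceField classes. It assumes the base class customizes pickling so that only a fixed set of fields is serialized, which I have not verified for _OpenForceFieldMolecule. Under that assumption, an attribute added after initialization would be dropped by any pickle round trip, which is what multiprocessing does when it passes objects to workers.

```python
import pickle


class PlainLigand:
    """Hypothetical ligand relying on default pickling (the whole __dict__ is kept)."""

    def __init__(self, smiles, name):
        self.smiles = smiles
        self.name = name


class CustomStateLigand(PlainLigand):
    """Hypothetical ligand that serializes only a fixed set of fields."""

    def __getstate__(self):
        # Assumption for illustration: only the constructor arguments are saved.
        return {"smiles": self.smiles, "name": self.name}

    def __setstate__(self, state):
        # Rebuild the object from the saved fields; nothing else is restored.
        self.__init__(**state)


def roundtrip(obj):
    """Mimic what multiprocessing does when sending an object to a worker."""
    return pickle.loads(pickle.dumps(obj))


plain = PlainLigand("CCO", "ethanol")
plain.docking_template_pdb_id = "4yne"

custom = CustomStateLigand("CCO", "ethanol")
custom.docking_template_pdb_id = "4yne"

plain_copy = roundtrip(plain)
custom_copy = roundtrip(custom)

print(hasattr(plain_copy, "docking_template_pdb_id"))   # True
print(hasattr(custom_copy, "docking_template_pdb_id"))  # False
```

If this is indeed the cause, running the same round trip on a real Ligand instance with an extra attribute should reproduce the loss without involving multiprocessing at all.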