ZimmermanGroup / py-conformational-sampling

MIT License
8 stars 1 forks source link

investigate why stereochemistry is not being preserved when embedding conformers with rdkit #14

Closed joshkamm closed 2 years ago

joshkamm commented 2 years ago

I spent some time trying to understand what was going on and looking through the documentation. It seems that rdkit doesn't automatically discover and assign chiral atoms and that this must be done explicitly before calling EmbedMultipleConfs.

It looks like rdkit has recent activity in this area: rdkit/rdkit#5330

A few options going forward:

I'm currently trying out openbabel confab and it doesn't appear to have a parallel implementation but at least it seems to naturally preserve stereochemistry. The openbabel python API is built with SWIG which is much less friendly with communicating data structures than rdkit's boost.python.

joshkamm commented 2 years ago

Test conformer generation code with openbabel and rdkit for future reference:

# EXPERIMENTING WITH RDKIT CONFORMER GENERATION
rdkit_mol = stk_ligand.to_rdkit_mol()
from rdkit import Chem
from rdkit.Chem import AllChem
print(Chem.FindMolChiralCenters(rdkit_mol, includeUnassigned=True, useLegacyImplementation=False))
# AllChem.AssignStereochemistryFrom3D(rdkit_mol, confId=0, replaceExistingTags=True)
si = Chem.FindPotentialStereo(rdkit_mol)
for element in si:
    print(f'  Type: {element.type}, Which: {element.centeredOn}, Specified: {element.specified}, Descriptor: {element.descriptor} ')
pass

# EXPERIMENTING WITH OPEN BABEL CONFORMER GENERATION
from openbabel import pybel as pb
pybel_mol = next(pb.readfile('xyz', str(ligand_path)))
# print(len(pybel_mol.conformers))
cs = pb.ob.OBConformerSearch()
cs.Setup(pybel_mol.OBMol, 10, 5, 5, 5) # numConformers, numChildren, mutability, convergence
cs.Search()
cs.GetConformers(pybel_mol.OBMol)
print(pybel_mol.OBMol.NumConformers())
stk_conformers = []
for i in range(pybel_mol.OBMol.NumConformers()):
    pybel_mol.OBMol.SetConformer(i)
    stk_conformers.append(pybel_mol_to_stk_mol(pybel_mol))
stk_list_to_xyz_file(stk_conformers, 'test_pybel_generation.xyz')    rdkit_mol = stk_ligand.to_rdkit_mol()
    conf_ids = Chem.AllChem.EmbedMultipleConfs(
        rdkit_mol,
        numConfs=config.initial_conformers,
        randomSeed=40,
        pruneRmsThresh=config.initial_rms_threshold,
        numThreads=config.num_cpus
    )
    stk_conformers = [stk_ligand.with_position_matrix(rdkit_mol.GetConformer(conf_id).GetPositions())
                   for conf_id in conf_ids]

Previous code for using rdkit to generate conformers:

    rdkit_mol = stk_ligand.to_rdkit_mol()
    conf_ids = Chem.AllChem.EmbedMultipleConfs(
        rdkit_mol,
        numConfs=config.initial_conformers,
        randomSeed=40,
        pruneRmsThresh=config.initial_rms_threshold,
        numThreads=config.num_cpus
    )
    stk_conformers = [stk_ligand.with_position_matrix(rdkit_mol.GetConformer(conf_id).GetPositions())
                   for conf_id in conf_ids]