chemosim-lab / ProLIF

Interaction Fingerprints for protein-ligand complexes and more
https://prolif.readthedocs.io
Apache License 2.0

Run fingerprint over multiple conformations of the same molecule #168

Open forcefield opened 10 months ago

forcefield commented 10 months ago

Fingerprint.run_from_iterable() expects the iterable to yield molecules. Sometimes we would like to iterate over different conformers of the same molecule, so that the pharmacophores can be perceived once and applied to multiple sets of coordinates.

Can this be done? It would be great if lig_iterable could yield either rdkit.Chem.Mol or rdkit.Chem.Conformer.

Currently, I have to do the following:

from rdkit import Chem
from rdkit.Chem import AllChem

class ConformerIterator:
    def __init__(self, molecule):
        if molecule is None:
            raise ValueError("Invalid molecule")

        self.molecule = molecule
        self._generate_conformers()
        self._current_conf_id = 0

    def _generate_conformers(self, num_confs=10):
        AllChem.EmbedMultipleConfs(self.molecule, numConfs=num_confs, randomSeed=42)

    def __iter__(self):
        # Reset the iterator to the first conformer
        self._current_conf_id = 0
        return self

    def __next__(self):
        if self._current_conf_id < self.molecule.GetNumConformers():
            conformer_mol = Chem.Mol(self.molecule)
            conformer_mol.RemoveAllConformers()
            conformer_mol.AddConformer(self.molecule.GetConformer(self._current_conf_id), assignId=True)
            self._current_conf_id += 1
            return conformer_mol
        else:
            raise StopIteration

# Example usage:
smiles_string = "CCO"
original_molecule = Chem.MolFromSmiles(smiles_string)

conformer_iterator = ConformerIterator(original_molecule)

for conformer_molecule in conformer_iterator:
    # Process each conformer_molecule
    # For example, you can calculate properties or perform other tasks
    print(f"Number of atoms in conformer: {conformer_molecule.GetNumAtoms()}")

Thanks!

forcefield commented 10 months ago

In fact, to make it work with ProLIF, I had to do the following:

import prolif
from rdkit import Chem

class ConformerIterator( object):
    def __init__( self, mol):
        self.molecule = mol
        self._current_conf_id = 0

    def __iter__( self):
        self._current_conf_id = 0
        return self

    def __next__( self):
        if self._current_conf_id < self.molecule.GetNumConformers():
            conformer_mol = Chem.Mol( self.molecule)
            conformer_mol.RemoveAllConformers()
            conformer_mol.AddConformer( self.molecule.GetConformer( self._current_conf_id), assignId=True)
            self._current_conf_id += 1
            return prolif.Molecule.from_rdkit( conformer_mol)
        else:
            raise StopIteration

cbouy commented 10 months ago

Hi @forcefield,

In its current form, there's no alternative to providing each conformer as a separate molecule: the pharmacophore search is done at the same stage as the geometry checks in each interaction class. Making this work would unfortunately require quite substantial changes (in the interaction classes, but also in how the prolif.Molecule objects are prepared), and I doubt that I'll have that amount of time in the near future 😕

PS: you can create a copy of a mol with a specific conformer ID with:

conformer_mol = Chem.Mol(self.molecule, confId=self._current_conf_id)
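
As a rough sketch of how the confId copy above could replace the hand-written iterator class, a generator can yield one single-conformer copy per conformer. The function name iter_conformer_mols and its wrap parameter are hypothetical, not part of RDKit or ProLIF, and pharmacophore perception would still run once per yielded molecule, as explained above.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def iter_conformer_mols(mol, wrap=None):
    """Yield one single-conformer copy of `mol` per conformer.

    Hypothetical helper: `wrap` is an optional callable applied to each
    copy before it is yielded (for example, a converter to another type).
    """
    for conf in mol.GetConformers():
        # Copy the molecule, keeping only the conformer with this ID.
        single = Chem.Mol(mol, confId=conf.GetId())
        yield wrap(single) if wrap is not None else single


# Example: embed a few conformers of ethanol, then iterate over them.
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMultipleConfs(mol, numConfs=5, randomSeed=42)

for conformer_mol in iter_conformer_mols(mol):
    # Each yielded molecule carries exactly one conformer.
    print(conformer_mol.GetNumConformers(), conformer_mol.GetNumAtoms())
```

With ProLIF, passing wrap=prolif.Molecule.from_rdkit should give an iterable that can be handed to Fingerprint.run_from_iterable() as lig_iterable, in the same way as the ConformerIterator class from the earlier comment.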