chemosim-lab / ProLIF

Interaction Fingerprints for protein-ligand complexes and more
https://prolif.readthedocs.io
Apache License 2.0
361 stars 68 forks source link

Getting error when calculating fingerprint #155

Closed Le-Phung-Hien closed 1 year ago

Le-Phung-Hien commented 1 year ago

Hi,

Thanks for the great work.

Im trying to run python interaction_barcode.py -p protein.pdb -l ligands.mol2 using this script: interaction_barcode.py.txt

With this ligands file: ligands.mol2.txt And protein file: protein.pdb.txt

Then I get this error:

 0%|                                                                                                                                        | 0/3 [00:00<?, ?it/s]
Can't kekulize mol.  Unkekulized atoms: 0 1 2 3 4 5 6 7 9
  0%|                                                                                                                                        | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/d/Screening/RESULT/interaction_barcode.py", line 35, in <module>
    fp.run_from_iterable(pose_iterable, protein_mol)
  File "/home/lephunghien/miniconda3/envs/prolif/lib/python3.11/site-packages/prolif/fingerprint.py", line 596, in run_from_iterable
    return self._run_iter_parallel(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lephunghien/miniconda3/envs/prolif/lib/python3.11/site-packages/prolif/fingerprint.py", line 629, in _run_iter_parallel
    for i, ifp_data in enumerate(pool.process(lig_iterable)):
  File "/home/lephunghien/miniconda3/envs/prolif/lib/python3.11/site-packages/tqdm/std.py", line 1182, in __iter__
    for obj in iterable:
  File "/home/lephunghien/miniconda3/envs/prolif/lib/python3.11/site-packages/multiprocess/pool.py", line 873, in next
    raise value
AttributeError: 'NoneType' object has no attribute 'GetAtomWithIdx'

Pretty sure there are something wrong with my code but I can't figure out where. Could you please advise me where is the mistake?

Thanks!

Thanks!

rwxayheee commented 1 year ago

Hi @Le-Phung-Hien,

Screenshot 2023-09-11 at 10 47 18 AM

I took a quick look at the ligand in PyMol. Given that the geometry and number of hydrogens are correct, there are two possible chemical structures I can think of: Untitled ACS Document 1996-1 In the quinonlone form (a): C7, C8, C9 and N10 are not aromatic. The atom type and bond order for these atoms will need to be corrected accordingly. In the hydroxy quinoline form (b): The bond between C9 and O11 is a single bond. The atom type and bond order for O11 will need to be corrected.

Without the N-methylation, quinonlone (a) and hydroxy quinoline (b) are tautomers. The N-methylation make them become resonance structures. In the end, this is more of a "bond order and atom type assignment problem". You can pick the assignment that aligns with your expectation/understanding of the ligand. Natural bond order and population analysis might help if you are interested in the electronic structure of this core.

cbouy commented 1 year ago

Thanks @rwxayheee for the help 🙌 !

RDKit is having trouble making sense out of that MOL2 file for the reasons mentioned above so it returns None instead of a molecule object.

Here's a dirty workaround to sanitize mols with this pattern using a reaction:

import prolif as plf
from rdkit import Chem
from rdkit.Chem import rdChemReactions

def patched_block_to_mol(self, block):
    mol = Chem.MolFromMol2Block("".join(block), removeHs=False, sanitize=False)
    try:
        Chem.SanitizeMol(mol)
    except Chem.KekulizeException:
        rxn = rdChemReactions.ReactionFromSmarts("[c&$(c1aa[n+]aa1):1]=[O:2]>>[c:1]-[O-:2]")
        rxn.RunReactantInPlace(mol)
        Chem.SanitizeMol(mol)
    return plf.Molecule.from_rdkit(mol, **self._kwargs)

plf.mol2_supplier.block_to_mol = patched_block_to_mol

Put this towards the top of your script before any call to mol2_supplier and you should be good to go (at least for mols with this pattern)

Le-Phung-Hien commented 1 year ago

Thanks a lot @cbouy @rwxayheee .