forlilab / Meeko

Interfacing RDKit and AutoDock
GNU Lesser General Public License v2.1
192 stars 48 forks source link

mk_prepare_ligand.py Issue #76

Closed heqing-chriscao closed 1 month ago

heqing-chriscao commented 11 months ago

Hello,

When I use "mk_prepare_ligand.py" to prepare a ligand, usually one of the two following problems occur:

"CompletedProcess(args='mk_prepare_ligand.py -i 1exd_aptamer_Model_1H.sdf -o 1exd_aptamer_Model_1H.pdbgt', returncode=0, stdout='', stderr='[20:26:47] Explicit v alence for atom # 549 H, 2, is greater than permitted\n[20:26:47] ERROR: Could not sanitize molecule ending on line 4891\n[20:26:47] ERROR: Explicit valence for atom # 549 H, 2, is greater than permitted\n')

"CompletedProcess(args='mk_prepare_ligand.py -i 1rpu_aptamer_Model_1H.sdf -o 1rpu_aptamer_Model_1H.pdbqt', returncode=0, stdout-'', stderr-'molecule has implicit hydrogens (name=1rpu_aptamer_Model_1H)\n\n')"

Before using mk_prepare_ligand.py, I use Reduce to add hydrogens to the ligand as suggested by AutoDock Vina, and Pymol to convert it to sdf file. I think something off happens during the first step as all errors are due to newly added hydrogens. I wonder if it is the problem of Reduce. If this is the case, are there any better alternatives for adding hydrogen?

The two erroneous files and their hydrogen-added version are attached here. Erroneou Files.zip

Any insight is appreciated. Thank you so much in advance.

diogomart commented 11 months ago

Hi,

The usual ligand for autodock is a drug-like molecule, typically with much less than 50 heavy atoms. These are nucleic acids, so much larger than the typical ligand. We are re-designing the receptor preparation and I think it could be helpful to parameterize these nucleic acids and still try to dock them rigidly. I'll try to remember to get back here once that is released, but I'll probably forget, and then I'll rediscover this when I go over open issues, so feel free to ping back here in about a couple weeks if you are still interested.

As for the errors you are getting, the first is at the RDKit level - there's a hydrogen with two bonds, and the second is meeko rejecting an RDKit molecule with implicit Hs.

rwxayheee commented 11 months ago

Hi @heqing-chriscao, Another problem is that 1rpu_aptamer_Model.pdb includes conformations with partial occupancies, which lead to duplicated heavy atoms. When there are two conformations, reduce tries to protonate both and create duplicate hydrogens. I recommend splitting the two conformations (A and B) before running reduce on them. And see if the generated SDF can pass the valence check.

heqing-chriscao commented 10 months ago

Hi @diogomart,

Thank you for your reply and for your information about the re-design of the receptor preparation. Does it mean, for now, our docking results are not accurate as our ligands are much larger than meeko expected?

For meeko rejecting an RDKit molecule with implicit Hs, are there any ways to fix the implicit hydrogens? I think after adding H's correctly, there shouldn't be any implicit hydrogens. Do you have any suggestions on the methods of adding hydrogens before using meeko?

Thank you so much.

heqing-chriscao commented 10 months ago

@rwxayheee

Thank you for pointing that out. I have checked some of the files that have the valance issue, and I find most of them indeed have multiple conformations. I will split them before docking them.

For the implicit H's issue, will you get the same error after using reduce?

Thanks.

rwxayheee commented 10 months ago

Hi @heqing-chriscao,

For the implicit H's issue, will you get the same error after using reduce?

I repeated your way of making the SDF file, and I found the implicit Hs. It is possible to generate the PDBQT file after correcting the charges (or add hydrogens if you prefer to).

Reduce did not protonate the 5' end phosphate groups, so for 1rpu_aptamer_Model, which is a double-strand RNA (21 nucleotides + 21 nucleotides), the total net charge should be -44 ( -1 for each phosphodiester bond and -2 for the terminal phosphate). However, in PyMOL's SDF file, only 42 are noted with CHG=-1, one from each phosphodiester bond or phosphate group.

Meeko will accept a correctly charged SDF file or RDKit molecule. You can do this by manually correct the SDF file or assign formal charge in RDKit:

import meeko, rdkit
from rdkit import Chem
from rdkit.Chem import rdmolops
from meeko import MoleculePreparation
from meeko import PDBQTWriterLegacy

# referece: https://gist.github.com/greglandrum/7f546d1e35c2df537c68a64d887793b8
def add_formal_charges(m):
    m.UpdatePropertyCache(strict=False)
    for at in m.GetAtoms():
        # [O-]
        if at.GetAtomicNum() in [8] and at.GetExplicitValence()==1 and at.GetFormalCharge()==0:
            at.SetFormalCharge(-1)

AH_from_sdf = Chem.SDMolSupplier('1rpu_aptamer_Model_AH.sdf', removeHs=False)[0]
print("Formal Charge from Input:", rdmolops.GetFormalCharge(AH_from_sdf)) # will get -42

# Correct charge and update valence properties
add_formal_charges(AH_from_sdf)
AH_from_sdf.UpdatePropertyCache(strict=False)

print("Formal Charge after Correction:", rdmolops.GetFormalCharge(AH_from_sdf)) # will get -44

# Generate ligand pdbqt string
preparator = MoleculePreparation()
mol_setups = preparator.prepare(AH_from_sdf)
for setup in mol_setups:
    pdbqt_string = PDBQTWriterLegacy.write_string(setup)[0]

# Write pdbqt string to file
with open('1rpu_aptamer_Model_AH.pdbqt','w') as fw:
    for line in pdbqt_string:
        fw.write(line)

But.! I agree with Diogo that it might not be feasible to dock a ligand with this many free torsions. Preparing it as a receptor (and dock rigidily) is also possible if you wish. I attached my files in case you're still interested in making a ligand PDBQT and seeing how many torsions it might have. Hope they are still helpful..

Input 1rpu_aptamer_Model_AH.sdf.txt Output 1rpu_aptamer_Model_AH.pdbqt.txt

ps. I split A/B before running reduce

heqing-chriscao commented 10 months ago

Thank you so much @rwxayheee!

The code you given is working well on my end. I will try to learn to how to mannually fix the other files with the same issue.