Changing default configs of the preparator

alfredoq commented 1 year ago

Hello, I am trying to prepare a .pdbqt file with Meeko within a jupyter notebook, and I would like to retain the non-polar hydrogens. In this respect, I assume I need to change the default options passed to the MoleculePreparation library within Meeko. However I am not able to get control on the corresponding parameter. My code is:

_from meeko import MoleculePreparation from rdkit import Chem from rdkit.Chem import AllChem

smiles= "CCCO" m = Chem.MolFromSmiles(smiles) m3d = Chem.AddHs(m) AllChem.EmbedMolecule(m3d)

preparator = MoleculePreparation() opts = preparator.get_defaults_dict() opts["keep_nonpolar_hydrogens"] = True preparator = MoleculePreparation(opts) preparator.prepare(m3d) pdbqt_string = preparator.write_pdbqt_string() pdbqt_string_ready = pdbqt_string.replace('UNL','MOL')

with open('ligand_file.pdbqt','w') as pdbqt_file: pdbqt_file.write(pdbqt_stringready)

I can see that the opt dictionary is indeed being modified, but the non-polar Hs are still being removed. Am I passing the opts in the correct form to MoleculePreparation?

Thanks in advance for the support

alfredoq commented 1 year ago

Hello again, I solved the problem, the passing of the args to MoleculePreparation was missing the ** string. with:

preparator = MoleculePreparation(**opts)

the manipulation of the non-polar Hs worked like a charm.

thanks

diogomart commented 1 year ago

Hi, you can also pass the options directly:

preparator = MoleculePreparation(keep_nonpolar_hydrogens=True)

alfredoq commented 1 year ago

Thank you @diogomart for this hint, I used it to simplify the code. great!

Also I found the _'atom_typesmarts': {} parameter. Is this intended to assign a specific AutoDock atom type to an atom matching a certain environment? For example, of I would like to turn all H atom as donors (HB), I tried the following setting:

_preparator = MoleculePreparation(keep_nonpolar_hydrogens = True, atom_typesmarts = {"smarts": "[#1]","type": "HD"})

however this didnt worked as expected. Am I using the correct format of the dictionary?

thank you again

diogomart commented 1 year ago

Yes atom_type_smarts is for defining atom types. The defaults are here.

Since you'll need also the non-hydrogen types, the easiest may be to save all parameters in a JSON file like this one below (note that all hydrogens are HD now):

{
  "ATOM_PARAMS": {
    "all_HD": [
      {"smarts": "[#1]",              "atype": "HD"},
      {"smarts": "[#5]",              "atype": "B"},
      {"smarts": "[C]",               "atype": "C"},
      {"smarts": "[c]",               "atype": "A"},
      {"smarts": "[#7]",              "atype": "NA"},
      {"smarts": "[#8]",              "atype": "OA"},
      {"smarts": "[#9]",              "atype": "F"},
      {"smarts": "[#12]",             "atype": "Mg"},
      {"smarts": "[#14]",             "atype": "Si"},
      {"smarts": "[#15]",             "atype": "P"},
      {"smarts": "[#16]",             "atype": "S"},
      {"smarts": "[#17]",             "atype": "Cl"},
      {"smarts": "[#20]",             "atype": "Ca"},
      {"smarts": "[#25]",             "atype": "Mn"},
      {"smarts": "[#26]",             "atype": "Fe"},
      {"smarts": "[#30]",             "atype": "Zn"},
      {"smarts": "[#35]",             "atype": "Br"},
      {"smarts": "[#53]",             "atype": "I"},
      {"smarts": "[#7X3v3][a]",       "atype": "N",  "comment": "pyrrole, aniline"},
      {"smarts": "[#7X3v3][#6X3v4]",  "atype": "N",  "comment": "amide"},
      {"smarts": "[#7+1]",            "atype": "N",  "comment": "ammonium, pyridinium"},
      {"smarts": "[SX2]",             "atype": "SA", "comment": "sulfur acceptor"}
    ]
  }
}

If this file is named "typing.json":

import json

with open("typing.json" as f:
    typing_rules = json.load(f)

preparator = MoleculePreparation(atom_type_smarts=typing_rules, keep_nonpolar_hydrogen=True)

alfredoq commented 1 year ago

thank you for the worked example!

kind regards

diogomart commented 1 year ago

You are welcome!

(just edited to include keep_nonpolar_hydrogens=True alongside the modified typing rules)

alfredoq commented 1 year ago

I added this line in the .json file to match the 4-H in a triazole ring, but it didn't matched. The 'atype' flag is applying to the first atom in the smarts string, am I right?

{"smarts": "[#1X1][#6X3][#6X3][#7X3][#7X3][#7X3]", "atype": "HD", "comment": "H4 in triazole ring"},

alfredoq commented 1 year ago

after a while I manage to get the atom type assignment to work. I realized that the matching smarts needs to be placed at the bottom of the dictionary (as indeed the example shows), otherwise the smarts assignment is replaced by the standard [#1] -> "H" assignment (since it hits the following dictionary entries).

My final code is:

_# Create a dictionary with atom typing dictio = { "ATOM_PARAMS": { "defaults+4Htriazole": [ {"smarts": "[#1]", {"smarts": "[#5]", {"smarts": "[C]", {"smarts": "[c]", {"smarts": "[#7]", {"smarts": "[#8]", {"smarts": "[#9]", {"smarts": "[#12]", {"smarts": "[#14]", {"smarts": "[#15]", {"smarts": "[#16]", {"smarts": "[#17]", {"smarts": "[#20]", {"smarts": "[#25]", {"smarts": "[#26]", {"smarts": "[#30]", {"smarts": "[#35]", {"smarts": "[#53]", {"smarts": "[#7X3v3][a]", {"smarts": "[#7X3v3][#6X3v4]", {"smarts": "[#7+1]", {"smarts": "[SX2]", {"smarts": "[#1][#6X3]:[ ] } } "atype": "H",}, "atype": "B"}, "atype": "C"}, "atype": "A"}, "atype": "NA"}, "atype": "OA"}, "atype": "F"}, "atype": "Mg"}, "atype": "Si"}, "atype": "P"}, "atype": "S"}, "atype": "Cl"}, "atype": "Ca"}, "atype": "Mn"}, "atype": "Fe"}, "atype": "Zn"}, "atype": "Br"}, "atype": "I"}, "atype": "N", "comment": "pyrrole, aniline"}, "atype": "N", "comment": "amide"}, "atype": "N", "comment": "ammonium, pyridinium"}, "atype": "SA", "comment": "sulfur acceptor"}, #6X3][#7]:[#7][#7]", "atype": "HD", "comment": "4,5-H in 1,2,3-triazole"},

_preparator = MoleculePreparation(keep_nonpolar_hydrogens = True, atom_typesmarts=dictio)

_smiles = "N1=NNC=C1" m = Chem.MolFromSmiles(smiles) m3d = Chem.AddHs(m) AllChem.EmbedMolecule(m3d) preparator.prepare(m3d) pdbqt_string = preparator.write_pdbqt_string() pdbqt_string_ready = pdbqtstring.replace('UNL','MOL')

_with open('ligand_file.pdbqt','w') as pdbqt_file: pdbqt_file.write(pdbqt_stringready)

worked as required :-)

thank you @diogomart for the valuable help,

best regards

Alfredo

diogomart commented 1 year ago

Glad you got it working, specially considering there's no documentation for this yet (sorry about that).

The 'atype' flag is applying to the first atom in the smarts string, am I right?

Yes. There's an optional IDX field that allows setting the type to other atoms that are not the first. For example, the following would set HD types to the hydrogens in difluoromethane. Note that indices start at 1.

{"smarts": "C(F)(F)([#1])([#1])", "IDX": [4, 5], "atype": "HD"}

I realized that the matching smarts needs to be placed at the bottom

That's correct - subsequent SMARTS matches override previous ones. Therefore, generic SMARTS patterns must precede the specific ones.

alfredoq commented 1 year ago

At this stage, combining the two actions in this thread (i.e. 1- assign a custom atom type to a 4-H in 1,2,3-triazole, 2- preparing the .pdbqt), I find that the H in 1,2,3-trizole is successfully changed to "HD", however after applying the MoleculePreparation function with the "keep_nonpolar_hydrogens = False", I see that the "HD" in the triazole is not presente, This should not be the case, and I suspect that non-polar Hs are removed prior to atom type assignment, am I right?

diogomart commented 1 year ago

non-polar Hs are removed prior to atom type assignment, am I right?

Yes. (actually no but merging ignores types)

I'm guessing the behavior you expect is hydrogens typed H to be merged, and those typed HD to be kept. Is this correct?

diogomart commented 1 year ago

I'm going to discuss with @atillack next week and implement an interface with more control over H merging.

Replacing keep_nonpolar_hydrogens=False with merge_hydrogens_typed="H" comes to mind. This is flexible as any number of SMARTS patterns to type H or otherwise can be set in the atom_type_smarts argument.

alfredoq commented 1 year ago

yes, my idea is to tune the atom type assignment, and afterwards merge nphs as marked in the pdbqt file by the user. In this way I believe I can model non classical bioisosterism during docking (e.g. between a petitidic bond and the 1,2,3-triazole scaffold).

I also observed that if I assign the "HD" hydrogen of an hydroxyl the type "H", after merging it will be kept as marked ("H"). So apparently the merging is driven by some different algorithm than just matching the assigned atom types (the merge_hydrogen you kindly pointed out).

alfredoq commented 1 year ago

I'm going to discuss with @atillack next week and implement an interface with more control over H merging.

Replacing keep_nonpolar_hydrogens=False with merge_hydrogens_typed="H" comes to mind. This is flexible as any number of SMARTS patterns to type H or otherwise can be set in the atom_type_smarts argument.

Thank you very much for considering this point!

diogomart commented 1 year ago

The code is now up on the develop branch. Argument keep_nonpolar_hydrogens does not exist anymore. If you initialize the preparator like this:

preparator = MoleculePreparation(atom_type_smarts=dictio)

it should work for you. Atoms typed H will be merged by default. If for some reason you need to change the type that gets merged, then:

preparator = MoleculePreparation(
    atom_type_smarts=dictio,
    merge_these_atom_types=("HX", "H999", "BAD_H"),
)

alfredoq commented 1 year ago

Thank you very much @diogomart for further developing this feature. I will try it right now.

best regards

alfredoq commented 1 year ago

Dear @diogomart

I cloned the develop brach, and tried to install:

pip install .

however I an getting the following error:

_Processing /home/fredy/Programas/Meeko/Meeko Preparing metadata (setup.py) ... error error: subprocess-exited-with-error

_× python setup.py egg_info did not run successfully. │ exit code: 1 ╰─> [1 lines of output] error in meeko setup command: 'pythonrequires' must be a string containing valid version specifiers; Invalid specifier: '>=3.5.*' [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip. error: metadata-generation-failed

× Encountered error while generating package metadata. ╰─> See above for output.

note: This is an issue with the package mentioned above, not pip. hint: See above for details._

Something strange having to do with the checking of the python version?

alfredoq commented 1 year ago

after some reading, I found that the line in setup.py:

_pythonrequires='>=3.5.*',

is failing because of the wildcard (apparently not supported),

I removed it and the package was installed succesfully,

best regards

forlilab / Meeko

Changing default configs of the preparator #46