forlilab / Meeko

Interface for AutoDock, molecule parameterization
https://meeko.readthedocs.io/
GNU Lesser General Public License v2.1

Fixes to export covalent (two-point attached) docking results #219

Closed rwxayheee closed 2 weeks ago

rwxayheee commented 3 weeks ago

The PR has two fixes to export covalent (two-point attached) docking results:

20a6a86 The order of the flexible sidechain information in the BEGIN_RES remark is different when it is generated from mk_prepare_ligand.py with CovalentBuilder:

BEGIN_RES A HIS 309

fe976df+56f7f31 Adds a check to mk_export.py that prints a warning when sdf_string is empty. If SDF is the only output requested, it exits and prints a suggestion to use -k to retain the flexres.
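The check described in the second fix can be sketched roughly as follows. This is a minimal illustration, not the code in mk_export.py: the function name `check_sdf_output`, its parameters, and the message wording are hypothetical, and only the `-k` suggestion comes from the description above.

```python
def check_sdf_output(sdf_string, only_sdf_requested):
    """Return (message, should_exit) for a possibly empty SDF string.

    Hypothetical helper: the name, arguments, and messages are illustrative only.
    """
    if sdf_string:
        # Something was generated, so there is nothing to warn about.
        return None, False
    message = "Warning: the SDF output is empty."
    if only_sdf_requested:
        # Nothing else would be written, so suggest -k and signal an exit.
        message += " Consider using -k to retain the flexible residues."
        return message, True
    return message, False
```

A caller would print the message to stderr and stop only when `should_exit` is true, so other requested outputs are still written.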

rwxayheee commented 3 weeks ago

Processing covalent (tethered) docking as a special type of flexible docking might need a little more work. I will look at the export code. Converting this to draft.

rwxayheee commented 3 weeks ago

I will keep working on this because it might be a useful structure to have when we want to modify the chemical identity of a chorizo residue (like merging it with a covalent ligand).

But we might want to add a note for people who are doing covalent docking that the poses are not clustered in AutoDock-GPU (I'm not quite sure about ranking, because the total free energy of binding will always be 0) and that many arguments (cluster_lead, etc.) will not work as expected. If we don't want to support it anymore, I will drop it from the Colab notebook.

rwxayheee commented 3 weeks ago

06fd1ce + 903287f are the minimal fix I can think of to add the covalent flexible residue as a special case in export_pdb_updated_flexres.

It's a little complicated and maybe more than a special case for an export function, because it requires modifying residues in the polymer (a LinkedRDKitChorizo object). The modification could have been a separate function, so that it is not repeated in each iteration.

The other problem I had was that pdbqt_mol (a PDBQTMolecule instance) has an __iter__ method and the constructor function RDKitMolCreate.from_pdbqt_mol always iterates through it.

To address both problems, I used deepcopy to avoid changing the original objects. It's very inefficient, since export_pdb_updated_flexres is inside the loop that iterates over the poses to export. Maybe what I wanted to do was implemented at the wrong level, not in the right place.
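To illustrate the trade-off, here is a toy sketch. `ToyPoses` and `build_from` are hypothetical stand-ins, not Meeko classes, and the sketch assumes that iterating the container changes its current-pose state, which is the kind of side effect a deepcopy shields against. It shows that copying before iteration keeps the caller's object untouched, at the cost of one full copy per call.

```python
from copy import deepcopy


class ToyPoses:
    """Hypothetical stand-in for a multi-pose container (not PDBQTMolecule)."""

    def __init__(self, poses):
        self.poses = list(poses)
        self.current = 0  # index of the currently selected pose

    def __iter__(self):
        # Assumed behavior: iterating moves the current-pose index.
        for i in range(len(self.poses)):
            self.current = i
            yield self.poses[i]


def build_from(container):
    """Hypothetical constructor that always iterates over every pose."""
    return [pose.upper() for pose in container]


poses = ToyPoses(["a", "b", "c"])
poses.current = 1  # the caller has selected the second pose

build_from(deepcopy(poses))  # iterate a copy: the original keeps its state
state_after_copy = poses.current

build_from(poses)  # iterate directly: the selected pose is overwritten
state_after_direct = poses.current
```

Whether a single copy made outside the pose loop would be enough depends on how much per-pose state the export function changes.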

There's a remaining issue: the mk_export.py option --all_dlg_poses currently affects only the SDF writing, not the poses written to PDB. From a quick look I don't have a simple fix for that, because PDBQTMolecule doesn't seem to have an equivalent option like only_cluster_leads.

@diogomart please let me know what you think, but there's no rush to merge. We can leave this open as a known issue.

rwxayheee commented 2 weeks ago

One advantage of treating the system as any other polymer/chorizo is that we would be able to export a covalent/tethered docking on nucleic acids

You're right, I forgot about nucleic acids. I thought the current ligand preparation for tethered docking might need a standard amino acid backbone. I really like the two-point method, but I don't know how much more work we are planning to do on it.

Vina currently can't take an empty ligand file, so I couldn't find a way to run this kind of docking with it. AD-GPU can do the calculation, but because all poses have a binding free energy of zero, I don't know if the poses are ranked correctly. I'm interested, though, and we can discuss more if we want to further improve this method.

Thanks again, I'm merging this for now