Fixes to export covalent (two-point attached) docking results

rwxayheee commented 3 weeks ago

The PR has two fixes to export covalent (two-point attached) docking results:

20a6a86 The order of the flexible sidechain information in the BEGIN_RES remark is different when it is generated from mk_prepare_ligand.py with CovalentBuilder:

BEGIN_RES A HIS 309
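
One way to tolerate both layouts is a parser that does not depend on the field order. The sketch below is not Meeko's code: `parse_begin_res` and `ResidueId` are hypothetical names, and it assumes the other layout simply puts the residue name before the chain ID (for example `BEGIN_RES HIS A 309`) and that the chain ID is a single character.

```python
from typing import NamedTuple


class ResidueId(NamedTuple):
    chain: str
    resname: str
    resnum: int


def parse_begin_res(line: str) -> ResidueId:
    """Parse a BEGIN_RES remark regardless of chain/resname order (hypothetical helper)."""
    fields = line.split()
    if len(fields) != 4 or fields[0] != "BEGIN_RES":
        raise ValueError(f"not a BEGIN_RES remark: {line!r}")
    first, second, resnum = fields[1], fields[2], int(fields[3])
    # Assumption: the chain ID is one character and the residue name is longer.
    if len(first) == 1 and len(second) > 1:
        return ResidueId(first, second, resnum)
    if len(second) == 1 and len(first) > 1:
        return ResidueId(second, first, resnum)
    # Both fields have the same shape, so the order cannot be inferred.
    raise ValueError(f"cannot tell chain from residue name: {line!r}")
```

With this heuristic a one-letter residue name (as used for some nucleotides) is ambiguous and raises instead of guessing.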

fe976df + 56f7f31 Add a check to mk_export.py that prints a warning when sdf_string is empty. If SDF is the only output requested, it exits and prints a suggestion to use -k to retain flexres.
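
The check described here can be sketched roughly as follows. `check_sdf_output`, its arguments, the messages, and the exit code are all hypothetical; only `sdf_string` and `-k` come from the description, and the actual logic in mk_export.py may differ.

```python
import sys


def check_sdf_output(sdf_string: str, sdf_is_only_output: bool) -> bool:
    """Return True if the SDF should be written; warn (and maybe exit) if it is empty."""
    if sdf_string:
        return True
    print("Warning: sdf_string is empty, no SDF will be written.", file=sys.stderr)
    if sdf_is_only_output:
        # Nothing else was requested, so stop with a hint (exit code chosen arbitrarily).
        print("Consider using -k to retain flexres.", file=sys.stderr)
        sys.exit(2)
    return False
```

Returning a flag in the non-fatal case lets the caller skip the SDF and still write the other requested outputs.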

rwxayheee commented 3 weeks ago

Processing covalent (tethered) docking as a special type of flexible docking might need a bit more work. I will look at the export code. Converting this to a draft.

rwxayheee commented 3 weeks ago

I will keep working on this because it might be a useful structure to have when we want to modify the chemical identity of a chorizo residue (like merging it with a covalent ligand).

But we might want to add a note for people who are doing covalent docking that the poses are not clustered in AutoDock-GPU (I'm not quite sure about ranking, because the total free energy of binding will always be 0) and that many arguments (cluster_lead, etc.) will not work as expected. If we don't want to support it anymore, I will drop it from the Colab notebook.

rwxayheee commented 3 weeks ago

06fd1ce + 903287f are the minimal fix I can think of to add a covalent flexible residue as a special case in export_pdb_updated_flexres.

It's a little complicated, and maybe more than a special case for an export function, because it requires modifying residues in the polymer (a LinkedRDKitChorizo object). The modification could have been a separate function instead of being repeated in each iteration.

The other problem I had was that pdbqt_mol (a PDBQTMolecule instance) has an __iter__ method and the constructor function RDKitMolCreate.from_pdbqt_mol always iterates through it.

To address both problems, I used deepcopy to avoid changing the original objects. It's very inefficient, since export_pdb_updated_flexres is inside the loop that iterates over the poses to export. Maybe what I wanted to do was implemented at the wrong level.
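
The side effect that a copy guards against can be illustrated with a toy stand-in. `ToyPoses` and `consume` below are hypothetical and are not `PDBQTMolecule` or `RDKitMolCreate.from_pdbqt_mol`; the sketch only assumes that iterating leaves the object pointing at a different pose.

```python
from copy import deepcopy


class ToyPoses:
    """Toy container whose iteration changes which pose is current."""

    def __init__(self, poses):
        self.poses = list(poses)
        self.current = 0

    def __iter__(self):
        for i in range(len(self.poses)):
            self.current = i  # side effect: iteration moves the cursor
            yield self


def consume(mol):
    """Stand-in for a constructor that always iterates over all poses."""
    return [m.poses[m.current] for m in mol]


mol = ToyPoses(["pose0", "pose1", "pose2"])
names = consume(deepcopy(mol))  # the copy absorbs the side effect
```

After this call `mol.current` is unchanged, whereas calling `consume(mol)` directly would leave it on the last pose. The cost is one full copy per call, which is why doing this inside the pose loop is expensive.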

There's a remaining issue: the mk_export.py option --all_dlg_poses currently only affects the SDF writing, not the poses written to PDB. From a quick look I don't have a simple fix for that, because PDBQTMolecule doesn't seem to have an equivalent option like only_cluster_leads.

@diogomart please let me know what you think, but there's no rush to merge. We can leave this open as a known issue.

rwxayheee commented 2 weeks ago

One advantage of treating the system as any other polymer/chorizo is that we would be able to export a covalent/tethered docking on nucleic acids

You're right, I forgot about nucleic acids. I thought the current ligand preparation for tethered docking might need a standard amino acid backbone. I really like the two-point method, but I don't know how much more work we are planning to do for this.

Vina currently can't take an empty ligand file, so I couldn't find a way to run this kind of docking with it. AD-GPU can do the calculation, but because all poses have a binding free energy of zero, I don't know if the poses are ranked correctly. I'm interested, though, and we can discuss more if we want to further improve this method.

Thanks again, I'm merging this for now

forlilab / Meeko

Fixes to export covalent (two-point attached) docking results #219