Closed srilekha1993 closed 3 months ago
@rwxayheee, can you please help me with the above issue?
Hi @srilekha1993 I don't fully understand the issue, can you explain what you want me to help with?
Were you trying to run the benchmark calculations in AD-GPU_set_of_42? Are you referring to ligand_properties.csv?
Can you share more details like the files and your commands? If your docked pose is 30 Å off from the reference, maybe you were docking on a different binding site or your reference ligand structure doesn't align with the receptor coordinates. Some visualization might help
@rwxayheee Thanks for your response.
Here are the complete details of my run:
1) Installation of AutoDock-GPU: I followed the README of the GitHub repo https://github.com/ccsb-scripps/AutoDock-GPU and built with the NUMWI=64 setting.
2) The dataset used for the experiment was downloaded from https://zenodo.org/records/4031961 . These are the 140 protein-ligand complexes mentioned in the AutoDock-GPU paper (https://pubs.acs.org/doi/10.1021/acs.jctc.0c01006). We ran the experiments on these complexes without making any changes to the dataset.
3) I ran AutoDock-GPU on the 5wlo dataset using the following command (the protein.maps.fld and rand-0.pdbqt files are present in the 5wlo directory):
/home/ubuntu/AutoDock-GPU/bin/autodock_gpu_64wi --ffile ./protein.maps.fld --lfile ./rand-0.pdbqt --nrun 20
4) I get the following output on the screen where the command is run
From the above screenshot, we can see that the energy value is -20.12 kcal/mol. This is close to the energy score of -20.45 listed for 5wlo in the ligand_properties.csv file that comes with the dataset.
5) We get the following RMSD table in the output file(rand-0.dlg) from our run
The RMSD value for 5wlo in ligand_properties.csv is 1.55 Å. As we can see from the above table, the reference RMSD from our run is ~30 Å. We are not sure why the reference RMSD is so high for our docked pose when we did not change any setting in the dataset. It would be great to have some clarification on this.
I am attaching the complete rand-0.dlg file for your reference. rand-0.txt
Thanks
Hi @srilekha1993, thanks for the detailed description.
I ran autodock-gpu for 5wlo dataset using the following command(protein.maps.fld and rand-0.pdbqt files are present in the 5wlo directory) /home/ubuntu/AutoDock-GPU/bin/autodock_gpu_64wi --ffile ./protein.maps.fld --lfile ./rand-0.pdbqt --nrun 20
The ligand input file rand-0.pdbqt doesn't align with the receptor coordinates. Please see a picture:
Therefore, if the input coordinates of the ligand were used to compute the reference RMSD (since no alternate reference was provided?), you will get a large RMSD even if the ligand was docked as expected.
Have you tried to visualize your output pose and compare to an aligned crystal structure?
When you run docking, you can also specify a reference ligand input. According to the README, this is done with the --xraylfile option.
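Hi @rwxayheee, thank you very much for your response. I am working with @srilekha1993 and I would like some basic clarifications since we are new to using AutoDock:
1) As we used the rand-0.pdbqt file in the downloaded dataset without making any changes, we weren't aware that the input ligand file has to be aligned with the receptor coordinates if the input coordinate of the ligand is being used as a reference. Is there a general way in which we can make that change to the rand-0.pdbqt file (not just for this dataset but for all the 140 complexes)?
2) We have not tried to visualize the output and have not compared it to an aligned crystal structure. Would you be able to suggest a visualization tool that is generally used so that we can take a look at it?
3) Thanks for pointing out the way to provide the reference ligand input using the --xraylfile option. Can you also tell us where we might find the x-ray reference ligand for the 5wlo dataset (and perhaps all the other receptors in the 140 complexes dataset)?
Thank you very much for your help on this.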
Regards, Manasi
Hi @manasi-t24,
The devs in this repository or authors of this work might be able to give you better answers. I will try my best from a user perspective:
As we used the rand-0.pdbqt file in the downloaded dataset without making any changes, we weren't aware that the input ligand file has to be aligned with the receptor coordinates if the input coordinate of the ligand is being used as a reference. Is there a general way in which we can make that change to the rand-0.pdbqt file (not just for this dataset but for all the 140 complexes )?
We don't need the input to be near the designated binding site (it can be in an arbitrary position). Also, rand-0.pdbqt seems to be a random conformer of the ligand. It's OK to use it as an input file, but to get a meaningful reference RMSD we need to provide a correct reference that aligns with the receptor coordinates.
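To illustrate why an unaligned reference inflates the number, here is a minimal sketch of a heavy-atom RMSD between two PDBQT structures. It is not AutoDock-GPU's own implementation: it assumes both files list the same atoms in the same order, treats the atom types H, HD and HS as hydrogens, and applies no symmetry correction. The function names are made up for this example.

```python
import math

# AutoDock atom types treated as hydrogens here (an assumption of this sketch).
HYDROGEN_TYPES = {"H", "HD", "HS"}


def heavy_atom_coords(pdbqt_text):
    """Return (x, y, z) tuples for the heavy atoms in a PDBQT text block."""
    coords = []
    for line in pdbqt_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom_type = line.split()[-1]  # the AutoDock atom type is the last field
        if atom_type in HYDROGEN_TYPES:
            continue
        # Fixed-width PDB coordinate columns 31-38, 39-46 and 47-54.
        coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def rmsd(pose, reference):
    """Plain RMSD over atoms paired by index; no fitting, no symmetry handling."""
    if not pose or len(pose) != len(reference):
        raise ValueError("pose and reference need the same, non-zero atom count")
    total = sum((a - b) ** 2 for p, r in zip(pose, reference) for a, b in zip(p, r))
    return math.sqrt(total / len(pose))


# A rigid 30 Å shift alone already gives an RMSD of 30 Å.
print(rmsd([(0.0, 0.0, 0.0)], [(30.0, 0.0, 0.0)]))  # 30.0
```

Because no superposition is done, any offset between the reference and the receptor frame goes straight into the result, which is consistent with a ~30 Å value for a pose that is otherwise docked correctly.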
We have not tried to visualize the output and have not compared it to an aligned crystal structure. Would you be able to suggest a visualization tool that is generally used so that we can take a look at it?
I used PyMOL for the screenshot shown above. It's very easy to learn, programmable through its Python integration, supports many common chemical structure file formats, and can make nice-looking pictures :)
Thanks for pointing out the way to provide the reference ligand input using the --xraylfile option. Can you also tell us where we might find the x-ray reference ligand for the 5wlo dataset (and perhaps all the other receptors in the 140 complexes dataset)?
I checked a few structures and I think the ligand files named flex-xray.pdbqt are generated from the ligands in their original crystal-structure positions. Sometimes alternate locations were assigned in the crystal structures, and flex-xray.pdbqt corresponds to one of them. You can double-check with the authors.
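As a sketch of how the reference could be passed in, the snippet below builds the same command as in the original run and appends --xraylfile. It assumes flex-xray.pdbqt sits in the 5wlo directory next to rand-0.pdbqt. The helper name build_docking_command is made up for this example, and the command is only printed here, not executed.

```python
import shlex


def build_docking_command(binary, fld_file, ligand_file, reference_file=None, nrun=20):
    """Assemble the AutoDock-GPU argument list used in this thread."""
    cmd = [binary, "--ffile", fld_file, "--lfile", ligand_file, "--nrun", str(nrun)]
    if reference_file is not None:
        # Reference ligand for the RMSD analysis, per the README option.
        cmd += ["--xraylfile", reference_file]
    return cmd


cmd = build_docking_command(
    "/home/ubuntu/AutoDock-GPU/bin/autodock_gpu_64wi",
    "./protein.maps.fld",
    "./rand-0.pdbqt",
    reference_file="./flex-xray.pdbqt",  # assumed to be in the same directory
)
print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) from inside the complex directory to launch the docking.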
If you wish to generate reference files on your own, a possible procedure could be:
1) Download the PDB file from a PDB server and extract the ligand coordinates
2) Obtain the SMILES string of the ligand from a chemical component library
3) Use rdkit.Chem.AllChem.AssignBondOrdersFromTemplate to assign bond orders
As long as all heavy atoms are present in the crystal structure, this function allows you to repair the ligand and turn it into an RDKit molecule with valid bond information, which can then be used in Meeko for PDBQT file generation.
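The bond-order step could look roughly like the sketch below. It assumes the ligand's HETATM records have already been extracted into a PDB block and that the SMILES describes exactly the same heavy atoms. The function name repair_ligand is made up for this example, and the Meeko step is not shown.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def repair_ligand(pdb_block, smiles):
    """Transfer bond orders from a SMILES template onto crystal coordinates."""
    template = Chem.MolFromSmiles(smiles)
    if template is None:
        raise ValueError("could not parse the SMILES string")
    # Without CONECT records, RDKit infers connectivity from interatomic
    # distances, so every bond starts out as a single bond.
    raw = Chem.MolFromPDBBlock(pdb_block, removeHs=True)
    if raw is None:
        raise ValueError("could not parse the PDB block")
    if raw.GetNumHeavyAtoms() != template.GetNumHeavyAtoms():
        raise ValueError("heavy-atom count differs between PDB block and SMILES")
    # Raises ValueError if the template cannot be matched onto the ligand.
    return AllChem.AssignBondOrdersFromTemplate(template, raw)
```

The returned molecule keeps the crystal coordinates. Hydrogens with 3D positions can be added afterwards with Chem.AddHs(mol, addCoords=True) before handing the molecule to Meeko.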
I confirm that files named rand- are random conformers in random locations meant to be used as input for docking, and that the flex-xray files have the x-ray positions from the PDB. Thank you @rwxayheee for your response.
Thank you @rwxayheee and @diogomart . We were able to get the desired results.
Regards, Manasi
Hi, the ligand_properties.csv file shows an RMSD_of_probable_global_minimum of 1.55 for the 5wlo protein. The attachment below shows the output docking table for one of the ligands of the 5wlo protein.
Can anyone help me understand how to evaluate the RMSD value from the above output?
Thanks