ccsb-scripps / AutoDock-Vina

AutoDock Vina
http://vina.scripps.edu
Apache License 2.0

Python bindings vina v.s vina executable #295

Closed rytakahashi closed 3 months ago

rytakahashi commented 3 months ago

I tried to reproduce the PoseBusters results with Vina (#31). With the vina executable (1.2.5), I could not reproduce the success rate of more than 50% of poses within 2 Å RMSD. Following the protocol in the paper, I ran:

vina --receptor receptor(+cofactors if there are).pdbqt --ligand ligand_start_conf.pdbqt --config ligand_conf.txt --seed 123 --num_modes 40 --exhaustiveness 32

The success rate I got was less than 30%.

Then, following the Meeko docking protocol with the Python bindings (vina 1.2.5), I was able to recover the paper's result: a success rate of more than 50% of poses within 2 Å RMSD. In the Python binding code, I did not find an option for extending the box size.

So my first question is: do the Python bindings have no option for extending the box size?

My second question is: shouldn't the vina executable (1.2.5) and the Python bindings behave exactly the same? In my docking runs, the Python bindings are much more accurate than the executable. I really don't know where such a large difference comes from.

Thanks,

rwxayheee commented 3 months ago

Hi @rytakahashi

In the Python binding code, I did not find an option for extending the box size. So my first question is: do the Python bindings have no option for extending the box size?

I don't understand. Can you show which option and how to use it?

Seems like you got better results with the Meeko protocol. I'm genuinely curious. Did you notice any differences in the ligand input files (if you generated them using obabel as part of the executable protocol, according to your other post)? Are all the other parameters the same?

rytakahashi commented 3 months ago

Hi, the Meeko protocol uses

v.compute_vina_maps(center=[centroid.x, centroid.y, centroid.z], box_size=[25, 25, 25])

while for the vina executable, ligand_conf.txt contains, e.g.

center_x = 16.606000
center_y = 16.541501
center_z = 41.621000

size_x = 64.606000
size_y = 73.331000
size_z = 66.803997

In the Meeko protocol, the ligand coordinates in ligand_start_conf.pdbqt and [centroid.x, centroid.y, centroid.z] have to match, right? In the latter case (the vina executable), ligand_start_conf.pdbqt does not need to match [center_x, center_y, center_z], right? Vina will translate the ligand's coordinates to the center given in ligand_conf.txt. My question was whether the Meeko protocol lacks this option.

And yes, over hundreds of Vina docking experiments, the Meeko protocol looks much better. But why? I thought it was just a Python wrapper.

rwxayheee commented 3 months ago

hi @rytakahashi

In the Meeko protocol, the ligand coordinates in ligand_start_conf.pdbqt and [centroid.x, centroid.y, centroid.z] have to match, right?

No

vina will translate the ligands' coordinate to ligand_conf.txt center. My question was Meeko protocol does not have this options?

Global search will be done anyway within the specified box (the center is not the initial center of ligand, but the center of search space, if my understanding is correct), unless you only do local optimization (--local_only or vina.optimize).
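To make the distinction between the box center and the ligand position concrete, here is a minimal sketch in plain Python (no Vina import) that derives the search-box bounds from a center and size and lists the starting ligand atoms that fall outside it. The helper names `box_bounds`, `read_pdbqt_coords`, `atoms_outside_box`, and `fake_atom_line` are hypothetical, the atom coordinates are made up, and the parser assumes that PDBQT keeps the PDB fixed-column layout for x, y, z. The assumption here is that the starting position matters mainly for local optimization, not for a global search.

```python
def box_bounds(center, size):
    """Return the (lower, upper) corners of an axis-aligned search box."""
    lower = [c - s / 2.0 for c, s in zip(center, size)]
    upper = [c + s / 2.0 for c, s in zip(center, size)]
    return lower, upper


def read_pdbqt_coords(lines):
    """Extract (x, y, z) from ATOM/HETATM lines (assumes PDB-style columns 31-54)."""
    coords = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def atoms_outside_box(coords, center, size):
    """Return the coordinates that lie outside the box on any axis."""
    lower, upper = box_bounds(center, size)
    return [
        xyz for xyz in coords
        if any(v < lo or v > hi for v, lo, hi in zip(xyz, lower, upper))
    ]


def fake_atom_line(x, y, z):
    """Hypothetical helper: build one fixed-column HETATM line for the demo."""
    return "HETATM%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (1, "C1", "LIG", "A", 1, x, y, z)


# Box center taken from the ligand_conf.txt example above; atoms are invented.
center = [16.606, 16.541501, 41.621]
size = [25.0, 25.0, 25.0]
lines = [fake_atom_line(16.0, 17.0, 42.0), fake_atom_line(40.0, 17.0, 42.0)]
outside = atoms_outside_box(read_pdbqt_coords(lines), center, size)
print(len(outside), "atom(s) outside the box")
```

A check like this is cheap to run before a local-only job, where a ligand that starts outside the box is more likely to be a problem than in a global search.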

And, yes, hundreds of vina docking experiments, Meeko protocol looks much better, but why?

I think maybe we are not comparing apples to apples, but definitely try to cross-validate with the same inputs, keeping the same docking parameters, and let us know what you find :>

rytakahashi commented 3 months ago

Many thanks for your comments. I ran one experiment in which I set the ligand centroid and the box center to the same values for both Meeko and vina, but vina still underperformed. Since I didn't set up local optimization correctly, maybe, as you say, I am not comparing apples to apples. I will check again, thanks.

diogomart commented 3 months ago

Did you also set the same box size? Larger boxes likely perform worse.

diogomart commented 3 months ago

centroid of the ligand and the center of the box

There is no option for centroid of the ligand. There is only the box center.

rytakahashi commented 3 months ago

Thanks for all your comments. First,

In the Meeko protocol, the ligand coordinates in ligand_start_conf.pdbqt and [centroid.x, centroid.y, centroid.z] have to match, right?

No

This came from vina.optimize() for local minimization. Since the PoseBusters starting ligands have already been optimized, I thought that applying vina.optimize() might require a larger box size.

Since I must compare apples with apples, the two runs should be as follows.

vina executable:

vina --receptor receptor.pdbqt --ligand ligand_start_conf.pdbqt --out vina.pdbqt  --config ligand_conf.txt --cpu 8 --seed 123 --num_modes 40 --exhaustiveness 32

while the Python binding code is

# Imports assumed; they were not shown in the original snippet.
from rdkit import Chem
from rdkit.Chem.rdMolTransforms import ComputeCentroid
from vina import Vina

lig = Chem.SDMolSupplier('ligand_conf.sdf', removeHs=False)[0]
centroid = ComputeCentroid(lig.GetConformer())

# The centroid is required for the docking box definition.
print(centroid.x, centroid.y, centroid.z)

# Main part of the docking study.
# Run this in a for loop over ligands; docking takes a few seconds per ligand.
v = Vina(sf_name='vina', cpu=8)
v.set_receptor('receptor.pdbqt')
v.set_ligand_from_file('ligand_start_conf.pdbqt')
v.compute_vina_maps(center=[centroid.x, centroid.y, centroid.z], box_size=[25., 25., 25.])
v.dock(exhaustiveness=32, n_poses=40)
v.write_poses('vina.pdbqt', n_poses=10, overwrite=True)

I think the two should be equivalent (ligand_conf.txt and v.compute_vina_maps carry the same information).
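One way to test that assumption rather than trust it is to parse the config file and compare its box with the values passed to `compute_vina_maps`. The sketch below is plain Python; `parse_vina_config`, `box_from_config`, and `same_box` are hypothetical helper names, and the parser assumes simple `key = value` lines with optional `#` comments. The config text reuses the center quoted earlier in this thread and the sizes found in the uploaded conf file; reusing the config center for the script box is for illustration only.

```python
def parse_vina_config(text):
    """Parse `key = value` lines into a dict of strings (blank lines and # comments skipped)."""
    options = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def box_from_config(options):
    """Return (center, size) as two lists of floats ordered x, y, z."""
    center = [float(options[f"center_{axis}"]) for axis in "xyz"]
    size = [float(options[f"size_{axis}"]) for axis in "xyz"]
    return center, size


def same_box(box_a, box_b, tol=1e-3):
    """True if both (center, size) pairs agree within `tol` on every axis."""
    return all(
        abs(a - b) <= tol
        for part_a, part_b in zip(box_a, box_b)
        for a, b in zip(part_a, part_b)
    )


config_text = """
center_x = 16.606000
center_y = 16.541501
center_z = 41.621000

size_x = 54.606000
size_y = 63.331000
size_z = 56.803997
"""

cli_box = box_from_config(parse_vina_config(config_text))
script_box = (cli_box[0], [25.0, 25.0, 25.0])  # box_size used in the Python script
print("boxes match:", same_box(cli_box, script_box))
```

With these values the check reports a mismatch, because the sizes differ even when the centers agree.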

However, the PoseBusters results for RMSD < 2 Å (True or False) differ: the executable gives True (25%) and False (75%), while the Python bindings give True (>50%) and False (<50%).

I am still puzzled why the results are so different. If someone could point out the difference between the command line and the code, I would really appreciate it.
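For reference, the success rate discussed here can be computed from a list of per-complex RMSD values as in the sketch below. The function name `success_rate` is hypothetical, the RMSD values are invented for illustration, and treating a pose at exactly 2 Å as a success is an assumption.

```python
def success_rate(rmsds, threshold=2.0):
    """Fraction of poses whose RMSD is at or below `threshold` (in angstroms)."""
    if not rmsds:
        raise ValueError("need at least one RMSD value")
    hits = sum(1 for r in rmsds if r <= threshold)
    return hits / len(rmsds)


# Made-up RMSD values, one per docked complex.
example_rmsds = [0.8, 1.9, 2.0, 3.5]
print(f"success rate: {success_rate(example_rmsds):.0%}")
```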

Many thanks,

diogomart commented 3 months ago

Can you send the input files?

rytakahashi commented 3 months ago

inputs.zip: the pdbqt files were created following the PoseBusters protocol in S1. Thanks,

diogomart commented 3 months ago

I don't have this:

centroid = ComputeCentroid(lig.GetConformer())

diogomart commented 3 months ago

Never mind, I saw your conf file and you still have different box sizes:

size_x = 54.606000
size_y = 63.331000
size_z = 56.803997

vs 25 in python script
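To put the mismatch in numbers, the following sketch compares the volumes of the two search boxes. It is arithmetic only; `box_volume` is a hypothetical helper, and reading the volume ratio as a proxy for search difficulty is an assumption, not a documented property of Vina.

```python
from math import prod


def box_volume(size):
    """Volume of an axis-aligned box given its edge lengths in angstroms."""
    return prod(size)


config_size = [54.606, 63.331, 56.803997]  # sizes from the conf file
script_size = [25.0, 25.0, 25.0]           # box_size in the Python script

ratio = box_volume(config_size) / box_volume(script_size)
print(f"config box is {ratio:.1f}x the volume of the script box")
```

The config box comes out roughly an order of magnitude larger, so at the same exhaustiveness the executable run was searching a much bigger space.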

rytakahashi commented 3 months ago

Ah,

centroid = ComputeCentroid(lig.GetConformer()) 

is calculated from 1G9V_RQ3_ligand.sdf.

size_x = 54.606000
size_y = 63.331000
size_z = 56.803997

These were calculated with PyMOL... yes, my code has a bug (my mistake). I will repeat my calculations and then probably close this issue. Many thanks for pointing this out.
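For the repeat runs, one way to keep the comparison fair is to define the box once and feed it to both the command line and the Python bindings. The sketch below does that; `vina_cli_args` and `run_python_docking` are hypothetical helper names. Passing the box with `--center_*` and `--size_*` flags instead of a config file, setting `seed=123` in the `Vina` constructor, and writing all 40 poses are choices made here to line the two runs up, not something taken from the PoseBusters protocol.

```python
def vina_cli_args(receptor, ligand, out, center, size,
                  cpu=8, seed=123, num_modes=40, exhaustiveness=32):
    """Build the argument list for the vina executable with an explicit box."""
    args = ["vina", "--receptor", receptor, "--ligand", ligand, "--out", out]
    for axis, c, s in zip("xyz", center, size):
        args += [f"--center_{axis}", str(c), f"--size_{axis}", str(s)]
    args += ["--cpu", str(cpu), "--seed", str(seed),
             "--num_modes", str(num_modes),
             "--exhaustiveness", str(exhaustiveness)]
    return args


def run_python_docking(receptor, ligand, out, center, size,
                       cpu=8, seed=123, n_poses=40, exhaustiveness=32):
    """Run the same docking through the Python bindings (requires the vina package)."""
    from vina import Vina  # imported lazily so the CLI helper works without it

    v = Vina(sf_name="vina", cpu=cpu, seed=seed)
    v.set_receptor(receptor)
    v.set_ligand_from_file(ligand)
    v.compute_vina_maps(center=list(center), box_size=list(size))
    v.dock(exhaustiveness=exhaustiveness, n_poses=n_poses)
    v.write_poses(out, n_poses=n_poses, overwrite=True)
```

The argument list from `vina_cli_args` can be passed to `subprocess.run(args, check=True)`, and `run_python_docking` can be called with the same `center` and `size`, so a box mismatch like the one above cannot creep in.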

rytakahashi commented 3 months ago

Yes, I now see that the vina executable and the Python bindings give about the same results. However, I am a bit surprised by how much the box size affects Vina's results. I was looking at gnina results at the same time; box size affects those results too, but there the success rate increased by less than 10%. Anyhow, many thanks again.