ccsb-scripps / AutoDock-Vina

AutoDock Vina
http://vina.scripps.edu
Apache License 2.0
562 stars 199 forks source link

[python] Autodock Vina not always sorting scores correctly #211

Closed Sjb4243 closed 1 year ago

Sjb4243 commented 1 year ago

Hello,

I installed Vina using pip install vina I prepared my receptor PDBQT file using REDUCE to add hydrogens I prepared my ligand PDBQT file using meeko's mk_prepare_ligand.py script

I am running docking from a Python script, the docking itself works fine, however the reported binding energies are not always sorted correctly. See as follows:

Performing docking (random seed: -904527756) ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1       -8.028          0          0
   2       -7.857      2.931      3.972
   3       -7.715      2.479      3.562
   4       -7.682      2.658      8.519
   5        -8.21      3.189      4.967

I would expect mode 5 to be at the top of the results.

I have attached the files used, and the following code was used to generate the docking;

v.set_receptor("5u1v_merged.pdbqt")
v.set_ligand_from_file("5u1w_7RY.pdbqt")
v.compute_vina_maps(center=[-17.8626, -4.41595, -15.0678], box_size=[13.89, 12.61, 13.09])
v.dock(exhaustiveness=8, n_poses=5)

docking.zip

diogomart commented 1 year ago

I can reproduce. Thanks. I'll look into this.

Sjb4243 commented 1 year ago

Great, if you would like I can also provide all of my code as well as all of the files I was using at the time. Out of around 25 runs with different files I estimate it happens around 8/9 times.

diogomart commented 1 year ago

I introduced this bug in v1.2.4 with c1217c1ac7f5cfdd5931eba01f187dd967969c16.

Fixed in #214

At the end of docking there is a local search using direct interactions with receptor atoms (instead of interpolating from grid maps). After this local search, which is called refinement, it was possible to have movable atoms just outside the grid box because of a positive tolerance margin of 0.0001 Angstrom set in non_cache.h. The slope (i.e. out-of-box penalty) used during refinement is 100/angstrom in the first refinement iteration. This slope leads to a penalty of 0.01 for an atom that is 0.0001 Angstroms outside the grid box. In v1.2.3 poses were sorted using this tiny penalty, but then re-scored with a slope equal to 1,000,000/angstrom, leading to UNBOUND energies that can reach 100 kcal/mol. To prevent this in v1.2.4 the margin was set to negative 0.0001 Ansgtrom in non_cache.h, but that created the problem reported here because poses that are inside the grid box but with less than 0.0001 Angstrom from a box face are given the penalty, sorted to the last position, and finally re-scored without the penalty. In your example, this is what happened to your 5th pose. It was given the huge penalty because one of the atoms was within 0.0001 Angstrom from a grid box face (but inside the box), sorted to last position, but then happened to have the best score when it was re-scored without the penalty. #214 fixes this by not giving the huge penalty anymore, it gives the same penalty that is given in the final rescoring (slope = 1e6).

diogomart commented 1 year ago

@Sjb4243 in my tests (see #214) the order of the poses was incorrect in 13% of the cases (25% with flexible sidechains). I think your grid box may be a little too small for that ligand, which may explain why you observe incorrect pose order in about 30% (8/25) of the cases.

Sjb4243 commented 1 year ago

@diogomart Thank you for explaining! I tried looking through the code myself and would have never have guessed that was the issue, I'll make my box bigger in future!

diogomart commented 1 year ago

@Sjb4243 many thanks for reporting so quickly after the release!