ccsb-scripps / AutoDock-Vina

AutoDock Vina
http://vina.scripps.edu
Apache License 2.0
562 stars 199 forks source link

Different values for affinity in vina1.2.3 and vina1.2.5 depending on the use of --rigid_macrocycles foe some molecules #218

Closed xavgit closed 1 year ago

xavgit commented 1 year ago

Dear all, I've docked 126 molecules with vina-1.2.3 and vina-1.2.5 to the same receptor,docking box,seed and other parameters in the config file. Each of the ligands has been mk_prepared with and without the --rigid_macrocycles option. All the returned values are the same, for both the vina versions and ligand preparation methods, except for ten ligands for which there are different values depending on the vina version and the presence of the --rigid_macrocycles option in the call of mk_prepare_ligand.py.

The following tables report the different values and behaviours for the best Affinity or mode 1.

Legenda: no rm --> mk_prepare_ligand.py without --rigid_macrocycles rm --> mk_prepare_ligand.py with --rigid_macrocycles

vina-1.2.5 lig_id no rm rm diff ZINC000170910319 -4.238 -5.21 0.972 ZINC000096249626 -5.078 -6.396 1.318 ZINC000019015189 -4.25 -6.764 2.514 ZINC000019015192 -4.228 -6.908 2.68 ZINC000170910317 -4.321 -5.884 1.563 ZINC000096249625 -4.919 -6.168 1.249 ZINC000096249575 -5.077 -6.24 1.163 ZINC000096196827 -4.353 -6.235 1.882 ZINC000096196855 -4.682 -6.389 1.707 ZINC000096196884 -4.955 -6.612 1.657 ZINC000096196843 -5.025 -6.052 1.027 The reported values are always lower when the --rigid_macrocycles option is used.

vina-1 .2.3 lig_id no rm rm diff ZINC000170910319 -9.094 -5.203 3.891 ZINC000096249626 -10.56 -6.374 4.186 ZINC000019015189 -14.12 -6.782 7.338 ZINC000019015192 -14.54 -6.674 7.866 ZINC000170910317 -9.43 -5.822 3.608 ZINC000096249625 -10.21 -6.66 3.55 ZINC000096249575 -10.32 -6.142 4.178 ZINC000096196827 -9.521 -6.614 2.907 ZINC000096196855 -10.44 -6.585 3.855 ZINC000096196884 -10.67 -6.623 4.047 ZINC000096196843 -10.05 -5.904 4.146 The reported values are always lower when the --rigid_macrocycles option is not used.

The molecules for which there are no difference in the values do not have atom types in the set G or CG whereas the ligands presenting two different values all have atom types in the previous set.

Also the processing times are different. I have reported some specific tests only for ZINC000019015192 in the file make_test_verb_2.sh_complete.log.

Where I'm wrong?

Thanks.

Saverio

make_test_verb_2.sh_complete.log

diogomart commented 1 year ago

Differences are expected because v1.2.4 fixed a bug that systematically gave macrocycles a lower but incorrect energy. See https://github.com/ccsb-scripps/AutoDock-Vina/issues/200#issuecomment-1590327079

xavgit commented 1 year ago

Ok. Now I'm using vina 1.2.5 ,as you can see, but there are differences with this version too. For example for the molecule ZINC000019015192 vina-1.2.5 returns -4.228 when --rigid_macrocycles option is not used whereas -6.908 when the previous option is used with a difference of 2.68 Kcal/mol. Which of the two ligand preparation methods, with or without --rigid_macrocycles option, it needs to use if they return different values for some molecules. I should choose a method for present the docking results. Thanks again. Saverio.

diogomart commented 1 year ago

The entropy loss upon binding is estimated based on the number of rotatable bonds. For ZINC000019015192 the number of rotatable bonds goes up from 2 to 13, and the penalty associated with entropy loss increases by about 2 kcal/mol. Such penalty is unreasonably large for ZINC000019015192 as it is fairly rigid. So the scores can't really be trusted much when macrocycles are made flexible, but the poses are potentially much better. To be fair, docking scores shouldn't be trusted much in general, anyway, but here it's a fairly systematic error.

I think this is not as much of a problem in autodock-gpu because it reads the TORSDOF field (if I'm not mistaken) while vina counts the number of BRANCH keywords. TORSDOF is still 2 even if this macrocycle is made flexible. Perhaps we should have Vina read the TORSDOF field as well... @atillack @sforli what about a third release in a week?!

atillack commented 1 year ago

@diogomart I am not the biggest fan of reading the TORSDOF field as this just moves the problem to preparation - and in this case in a way only works accidentally as it wasn't adjusted upon making the macrocycles flexible. I think a better, slightly more general approach would be to exclude the macrocycle torsions automatically (or only count them fractionally as some DOFs will remain upon closure).

diogomart commented 1 year ago

The fractional TORSDOF is probably a good idea. It's not by accident that TORSDOF is still 2, this was a deliberate decision in meeko to avoid overpenalizing macrocycles. Having preparation set TORSDOF allow us to keep TORSDOF constant independently of the number of bonds that are made rotatable.

atillack commented 1 year ago

Sorry, I know Meeko does "the right thing" - I am just not liking the dependence on a field in an ancient format that may also come from other sources. At the same time, PDBQT is the correct input format for Vina and so we may as well use it's features. Maybe we can go a little bit more general for future things of course 😉

diogomart commented 1 year ago

@xavgit we decided to not fix this. The number of rotatable bonds is a very crude model for the entropy loss upon binding, and our assessment is that there are no good options. As it is, macrocycles are over-penalized because each rotatable bond contributes to the entropy loss. If Vina reads the TORSDOF from Meeko, which excludes rotatable bonds within macrocyclic rings, then they would be under penalized. The fractional TORSDOF that Andreas suggested is a good compromise but would require some work to figure out the magnitudes of fractional TORSDOF contributions. Instead of fixing anything, we will just document this limitation #219.

Thank you @xavgit for reporting this so thoroughly.