ccsb-scripps / AutoDock-Vina

AutoDock Vina
http://vina.scripps.edu
Apache License 2.0
596 stars 209 forks source link

Clarification on hydrophobic interaction contribution #285

Closed toby-aqemia closed 6 months ago

toby-aqemia commented 7 months ago

I am experiencing some unexpected behaviour with the Vina scoring function, in particular the hydrophobic interaction term. From what I understand the score is calculated as the following sum: $$c=\sum{i < j} f{t_i tj}(r{ij})$$ And the hydrophobic term is always non-negative (but then weighted to become negative). For this reason I expect that adding atoms should always increase (or leave unchanged) the hydrophobic interaction term. However this is not what I observe. With the following example I find the interaction term for a molecule to be 1.452 while the interaction term for a subset of atoms is 1.766. Note that I extract the subset by deleting lines in the pdbqt file so the atom types don't change (although the molecule becomes unrealistic). I thought I could be observing some difference due to intra-molecule interactions but from what I understand these aren't included in the lig_inter part of the score output.

Could it be to do with how I've built my toy example with unrealistic molecules and proteins? Or have I fundamentally misunderstood some aspect of how the scoring function is built? I would much appreciate any help to see what I am missing.

Code to reproduce

from pathlib import Path

from vina import Vina

HERE = Path(__file__).parent

# files provided below
mol_path = HERE / "mol.pdbqt"
subset_of_mol_path = HERE / "subset_of_mol.pdbqt"
protein_path = HERE / "protein_fragment.pdbqt"

scorer = Vina(sf_name="vina", verbosity=0, no_refine=False)
scorer.set_receptor(str(protein_path))

# keep only hydrophobic term
weight = (0, 0, 0, 1, 0, 0, 0)
scorer.set_weights(weight)

# compute maps
scorer.compute_vina_maps(
    [-4.44247372, 3.58252631, 14.61826309],
    [11.95399952, 15.97199988, 14.76799965],
)

for mol_path in [mol_path, subset_of_mol_path]:
    scorer.set_ligand_from_file(str(mol_path))

    score = scorer.score()
    print(f"Unweighted hydrophobic interaction term for {mol_path.name}: {score[1]}")

protein_fragment.pdbqt

ATOM     39  CZ  TYR A   7       0.217  -5.062  13.014  1.00  0.00     0.131 A 
ATOM     40  OH  TYR A   7      -0.670  -4.109  13.459  1.00  0.00    -0.190 OA
ATOM    750  ND1 HIS A  96      -4.805  -4.973  14.161  1.00  0.00    -0.227 NA
ATOM    752  CE1 HIS A  96      -4.859  -3.698  14.442  1.00  0.00     0.199 A 
ATOM   1550  CG2 THR A 199      -1.424  -1.061  14.615  1.00  0.00     0.038 C 
ATOM   1898  O   ASN A 243      -2.883  -6.788  13.243  1.00  0.00    -0.268 OA

mol.pdbqt

REMARK  Name = 
REMARK                            x       y       z     vdW  Elec       q    Type
REMARK                         _______ _______ _______ _____ _____    ______ ____
ROOT
ATOM      1  S   UNL     1      -6.312   3.596  15.766  0.00  0.00    +0.000 S 
ATOM      2  C   UNL     1      -5.391   2.199  16.010  0.00  0.00    +0.000 A 
ATOM      3  C   UNL     1      -4.122   2.251  15.429  0.00  0.00    +0.000 A 
ATOM      4  C   UNL     1      -3.943   3.498  14.755  0.00  0.00    +0.000 A 
ATOM      5  C   UNL     1      -2.720   3.866  13.998  0.00  0.00    +0.000 C 
ATOM      6  C   UNL     1      -2.676   5.387  13.749  0.00  0.00    +0.000 C 
ATOM      7  C   UNL     1      -3.910   5.973  13.106  0.00  0.00    +0.000 C 
ATOM      8  S   UNL     1      -5.172   5.969  14.361  0.00  0.00    +0.000 S 
ATOM      9  C   UNL     1      -5.089   4.322  14.860  0.00  0.00    +0.000 A 
ATOM     10  C   UNL     1      -3.617   7.415  12.654  0.00  0.00    +0.000 C 
ATOM     11  O   UNL     1      -6.440   6.188  13.719  0.00  0.00    +0.000 OA
ATOM     12  O   UNL     1      -4.808   6.760  15.507  0.00  0.00    +0.000 OA
ENDROOT
BRANCH   2  13
ATOM     13  S   UNL     1      -6.025   0.889  16.944  0.00  0.00    +0.000 S 
ATOM     14  O   UNL     1      -7.450   0.942  16.779  0.00  0.00    +0.000 OA
ATOM     15  O   UNL     1      -5.492   0.980  18.275  0.00  0.00    +0.000 OA
ATOM     16  N   UNL     1      -5.472  -0.571  16.240  0.00  0.00    +0.000 NA
ENDBRANCH   2  13
BRANCH   5  17
ATOM     17  N   UNL     1      -2.741   3.129  12.722  0.00  0.00    +0.000 NA
BRANCH  17  18
ATOM     18  C   UNL     1      -1.473   3.172  11.982  0.00  0.00    +0.000 C 
ATOM     19  C   UNL     1      -1.554   2.103  10.891  0.00  0.00    +0.000 C 
ENDBRANCH  17  18
ENDBRANCH   5  17
TORSDOF 0

subset_of_mol.pdbqt made up of atoms 4, 5, 18, and 19 of the molecule above.

REMARK  Name = 
REMARK                            x       y       z     vdW  Elec       q    Type
REMARK                         _______ _______ _______ _____ _____    ______ ____
ROOT
ATOM      1  C   UNL     1      -4.122   2.251  15.429  0.00  0.00    +0.000 A 
ATOM      2  C   UNL     1      -3.943   3.498  14.755  0.00  0.00    +0.000 A 
ATOM      3  C   UNL     1      -1.473   3.172  11.982  0.00  0.00    +0.000 C 
ATOM      4  C   UNL     1      -1.554   2.103  10.891  0.00  0.00    +0.000 C 
ENDROOT
TORSDOF 0
diogomart commented 7 months ago

Vina uses its own types internally, and carbons that are bound to some heteroatoms are not considered hydrophobic. So, the system with four atoms has four hydrophobic carbons, but in the original molecule atom 18 (atom 3 in the four-atom system) is not hydrophobic because it is bonded to nitrogen.

toby-aqemia commented 6 months ago

Ah I missed that, I thought it was all done upstream of the pdbqt file. Thank you very much for your help!