ccsb-scripps / AutoDock-Vina

AutoDock Vina
http://vina.scripps.edu
Apache License 2.0
561 stars 199 forks source link

Issue with using ad4 ff with python #254

Open kmorisa opened 7 months ago

kmorisa commented 7 months ago

I am writing a python script and encountered with a following error: ... v = vina.Vina(sf_name='ad4', verbosity=1, seed=selectedSeed) ... v.set_receptor(rigid_pdbqt_filename=="xxx.pdbqt",flex_pdbqt_filename="yyy.pdbqt") ...

ERROR: Only flexible residues allowed with the AD4 scoring function. No (rigid) receptor.

From which file does this error arise, and what would be a troubleshooting process if exists?

Thanks, KM

diogomart commented 7 months ago

With AD4 scoring, vina expects autogrid maps to be loaded, but the PDBQT for the rigid part of the receptor should not be passed to v.set_receptor.

kmorisa commented 7 months ago

Thank you for your answer. Yes, it resolved the problem, and I got the error about the maps, as you mentioned. I have a line that creates a gpf4 file by incorporating prepare_gpf4.py. Is there any way to create maps from this using python?

Thanks, KM

diogomart commented 7 months ago

Maps are generated by autogrid, no python bindings https://autodock-vina.readthedocs.io/en/latest/docking_basic.html#optional-generating-affinity-maps-for-autodock-ff

kmorisa commented 7 months ago

I see. Thank you for your answers.

Thanks, KM

diogomart commented 7 months ago

You're welcome!

Dadiao-shuai commented 7 months ago

Hi, I also want to use ad4 scoring_function to score a dataset, (not dock) so is it possible for v = Vina(sf_name='ad4') just score?

I also wonder what other atom type can be written in 'generating-afinity-map-for-autodockff' because some complex have S or Mg.

Dadiao-shuai commented 7 months ago

Hi, I have figure out how to use pythonsh prepare_gpf.py to get gpf for 1a30_pocket.pdbqt and 1a30_ligand.pdbqt But when I use autogrid4 to get the fld and maps, it raises error:

 autogrid4 -p 1a30_pocket.gpf -l 1a30_pocket.glg

autogrid4: ERROR:  unknown ligand atom type CN
add parameters for it to the parameter library first!
rwxayheee commented 7 months ago

Hi @Dadiao-shuai,

not dock) so is it possible for v = Vina(sf_name='ad4') just score?

Yes, when you set v = Vina(sf_name='ad4'), you only initialize the Vina object but it still doesn't know what to do. You can tell it to perform different things including score, dock, optimize, and print or write outcomes.

List of these options and functions here: https://autodock-vina.readthedocs.io/en/latest/vina.html

Example of using score is included in the Python scripting tutorial: https://autodock-vina.readthedocs.io/en/latest/docking_python.html

I also wonder what other atom type can be written in 'generating-afinity-map-for-autodockff' because some complex have S or Mg.

This is beyond my knowledge. I am also a bit confused how S and SA are determined. I don't think my questions are relevant to the original post though.. Maybe we could open a new issue to discuss.

Edit: To your question I think YES for S and Mg, at present, we can generate affinity maps for S, SA, and Mg using Autogrid4.

autogrid4: ERROR: unknown ligand atom type CN add parameters for it to the parameter library first!

Do you have CN in line ligand_types of your gpf file? Is it a typo (missing space between C and N)? CN is not currently a valid autodock atom type.

Kerro-junior commented 7 months ago

the ligands are prepared with the adfr scripts, so I do not think it is a typo error. Here is another example:

autogrid4: ERROR: unknown ligand atom type AN add parameters for it to the parameter library first!

image

diogomart commented 7 months ago

@Dadiao-shuai @Kerro-junior AN and CN are not standard atom types, are they in the GPF or in the PDBQT of either ligand or receptor?

Kerro-junior commented 7 months ago

I tired the 1a30_ligand.sdf in CASF2016 coreset, it couldn't be transferred into pdbqt:

mk_prepare_ligand.py -i 1a30_ligand.sdf -o 1a30_ligand.pdbqt
[10:19:21] Explicit valence for atom # 6 C, 6, is greater than permitted
[10:19:21] ERROR: Could not sanitize molecule ending on line 102
[10:19:21] ERROR: Explicit valence for atom # 6 C, 6, is greater than permitted

But I didn't know what to do, I didn't find error in pymol: image 1a30_ligand.zip

Dadiao-shuai commented 7 months ago

I must say the CN atom is in the pocket,gpf file, it seems like not about the ligand? :

receptor 1a30_pocket.pdbqt           # macromolecule
gridfld 1a30_pocket.maps.fld         # grid_data_file
npts 40 40 40                        # num.grid points in xyz
spacing 0.375                        # spacing(A)
gridcenter 8.8395 25.2365 4.6875     # xyz-coordinates or auto
types CN                             # atom type names
smooth 0.5                           # store minimum energy w/in rad(A)
map 1a30_pocket.C.map                # atom-specific affinity map

I didn't find 'CN' in the ligand.pdbqt file.

@Kerro-junior I used meeko to prepare the ligand, but couldn't convert into pdbqt, even though I have tried obabel/rdkit to sanitize them(fail to sanitize):

[11:14:31] Explicit valence for atom # 25 C, 6, is greater than permitted
[11:14:31] ERROR: Could not sanitize molecule ending on line 133
[11:14:31] ERROR: Explicit valence for atom # 25 C, 6, is greater than permitted

So I used the adfr prepare_ligand4 and successfully gain the ligand.pdbqt

rwxayheee commented 7 months ago

Hi @Kerro-junior and @Dadiao-shuai,

I assume you two work together. Here's my suggestion:

  1. Problem with the ligand SDF file Whoever responsible for ligand preparation needs to take a closer look at the SDF file. The error message you got from meeko (RDKit actually) was cause by incorrect information in the atom blocks (charge) and the bond blocks. prepare_ligand4 could be one walkaround because it ignores these problems. To see how RDKit perceives the carboxylate groups in the ligand, read the molecule without sanitizing it and check the valence and charge, atom by atom:
    
    from rdkit import Chem
    from rdkit.Chem import rdmolfiles

input_sdf = "1a30_ligand.sdf"

m = rdmolfiles.MolFromMolFile(input_sdf, sanitize=False) for at in m.GetAtoms(): print(at.GetAtomicNum(), at.GetExplicitValence(), at.GetFormalCharge())



2. Problem with the contents of pocket.GPF

`autogrid4: ERROR: unknown ligand atom type CN`

This error message usually means you have this strange atom type in GPF file, which should be `C` and `N`. A space delimiter was missing, and other essential atom types (such as `OA` and `HD`, for that ligand you need those maps too) were also not present for some reason. Whoever of you responsible for the GPF file please check before using it for `autogrid4`. 

To let others reproduce this issue, please kindly provide the input files and command you did with `prepare_gpf.py`. 
Kerro-junior commented 7 months ago

There is nothing special about the usage of preparegpf.py to generate multiple gpf files: (the following script is compatible and run with python2.5)_

rwxayheee commented 7 months ago

@Kerro-junior Can you try AutoDockTools/Utilities24/prepare_gpf4.py instead? prepare_gpf.py in this repository is AutoDockTools/Utilities24/prepare_gpf4.py

The last line in AutoDockTools/Utilities24/prepare_gpf.py

#prepare_gpf.py -l indinavir.pdbq -r 1hsg.pdbqs -p spacing=0.4 -p npts=[60,60,60] -i ref.gpf -o testing.dpf 

suggests that the legacy code was likely intended for different input formats... This might explain the syntax difference and problem with parsing atom types in a PDBQT input

Dadiao-shuai commented 7 months ago

image

Thanks for your instruction, now I successfully generated all the gpf, Then I used autogrid4 and I get the following files, BUT some of them don't havefld/xyz/map files

rwxayheee commented 7 months ago

Hi @Dadiao-shuai, It's hard to know the cause without more information. You can take a look at the AutoGrid log file (.glg files).

Kerro-junior commented 7 months ago

We have prepared all the glg, fld and maps files that needed. But our script encounters the following error:

Reading AD4.2 maps ... done.
ERROR: Cannot compute Vina affinity maps using the AD4 scoring function.

image

This is the python env:

vina         1.2.2
python    3.7
numpy    1.20.3
Dadiao-shuai commented 7 months ago

to add more information for our @Kerro-junior job, this is the pythonvina1.2.2script:

            v.load_maps('1a30/1a30_pocket')
            v.set_ligand_from_file(lig_file)
            v.compute_vina_maps(center=center, box_size=[34,34,34])
            energy = v.score()

Maybe we have not generated the dpg file (as autodock4 need this)? image

diogomart commented 7 months ago

Don't compute the maps , just load them in. AD4 maps are calculated by autogrid, not by vina. Also, consider updating to the latest version which is 1.2.5

diogomart commented 7 months ago

According to the log file from autogrid (.glg), the .gpf instructed autogrid to compute maps for the following types

ligand_types A HD N NA               # ligand atom types
Dadiao-shuai commented 7 months ago

In addition, the 1bcu_pocket_rigid.gpf for 1bcu_pocket,pdbqt and 1bcu_ligand.pdbqt is :

npts 40 40 40                        # num.grid points in xyz
gridfld 1bcu_pocket.maps.fld         # grid_data_file
spacing 0.375                        # spacing(A)
receptor_types A C HD N OA SA        # receptor atom types
ligand_types A HD N NA               # ligand atom types
receptor 1bcu_pocket.pdbqt           # macromolecule
gridcenter auto                      # xyz-coordinates or auto
smooth 0.5                           # store minimum energy w/in rad(A)
map 1bcu_pocket.A.map                # atom-specific affinity map
map 1bcu_pocket.HD.map               # atom-specific affinity map
map 1bcu_pocket.N.map                # atom-specific affinity map
map 1bcu_pocket.NA.map               # atom-specific affinity map
elecmap 1bcu_pocket.e.map            # electrostatic potential map
dsolvmap 1bcu_pocket.d.map              # desolvation potential map
dielectric -0.1465                   # <0, AD4 distance-dep.diel;>0, constant

As the receptor has C atom, why autogrid4 could not produce C.map ? (I don't know if it's problematic to name the gpf as 1bcu_pocket_rigid.gpf)

diogomart commented 7 months ago

See the autodock4 manual for context: https://autodock.scripps.edu/wp-content/uploads/sites/56/2021/10/AutoDock4.2.6_UserGuide.pdf

diogomart commented 6 months ago

Such large energies typically occur when the ligand is outside the grid box. Vina v1.2.2 was not checking if the ligand is inside the search space when using v.score(). See https://github.com/ccsb-scripps/AutoDock-Vina/issues/112. I recommend upgrading to v1.2.5 because of this and other fixes.

diogomart commented 6 months ago

That's correct, the grid size is set in the gpf.