ccsb-scripps / AutoDock-Vina

AutoDock Vina
http://vina.scripps.edu
Apache License 2.0

Autogrid4 Atom Type Error with Large Ligand Database #297

Open R-Stefano opened 6 months ago

R-Stefano commented 6 months ago

While attempting virtual screening with the Enamine DDS-10 ligand database (~50k compounds), I encountered an error while generating the grid map files with autogrid4. The error message is: autogrid4: ERROR: unknown ligand atom type CG0 add parameters for it to the parameter library first!

The message suggests adding this atom type to the AD4.1_bound.dat file. Given the vast number of unique atom types in my .gpf file, manual updates seem impractical.

Is my approach for screening this large ligand set correct, and is there an automated solution for updating the AD4.1_bound.dat file to recognize all unique atom types?

Here are more details on my approach: after the usual preparation steps, I generate the grid map files by first running python3 prepare_gpf4.py -l ./experiments/ligands/active_ligand.pdbqt -r ./experiments/5FNQ.pdbqt -y -d ./experiments/ligands

and then running autogrid4 ./experiments/5FNQ.gpf -l ./experiments/5FNQ.glg

However autogrid4 is failing with the following error: autogrid4: ERROR: unknown ligand atom type CG0 add parameters for it to the parameter library first!

rwxayheee commented 6 months ago

Hi @R-Stefano, CG0 is a glue atom and its original atom type is C, so no additional maps are needed. One of the easiest workarounds is to exclude these lines from the generated gpf file: grep -v -e "CG0" -e "CG1" old.gpf > new.gpf
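For a ligand set with many glue types, the same idea can be sketched in Python. This is only an illustration under stated assumptions: that glue carbon types are named CG followed by digits (CG0, CG1, ...), that the GPF lists types space-separated on a single ligand_types line, and that map files are named <receptor>.<type>.map. The function name strip_glue_types is hypothetical.

```python
import re

# Assumption: glue carbon types are "CG" followed by digits (CG0, CG1, ...).
GLUE_TYPE = re.compile(r"^CG\d+$")


def strip_glue_types(gpf_text: str) -> str:
    """Drop glue types from the ligand_types line and remove their map lines."""
    kept = []
    for line in gpf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "ligand_types":
            # Keep any trailing comment, filter only the type names.
            body, _, comment = line.partition("#")
            types = [t for t in body.split()[1:] if not GLUE_TYPE.match(t)]
            line = "ligand_types " + " ".join(types)
            if comment:
                line += "  #" + comment
        elif len(fields) >= 2 and fields[0] == "map":
            # Assumed map naming: <receptor>.<type>.map
            parts = fields[1].split(".")
            if len(parts) >= 3 and GLUE_TYPE.match(parts[-2]):
                continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

Unlike a plain grep -v, this keeps the ligand_types line itself and only removes the glue entries from it, which matters when all types sit on one line.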

R-Stefano commented 6 months ago

Hi @rwxayheee, I see. However, I had hundreds more glue atoms, so that was not feasible. I ended up removing the -d parameter and instead passing -p ligand_types="H,HD,HS,C,A,N,NA,NS,OA,OS,F,Mg,MG,P,SA,S,Cl,CL,Ca,CA,Mn,MN,Fe,FE,Zn,ZN,Br,BR,I,Z,G,GA,J,Q" where the ligand types are the atom types listed in my AD4.1_bound.dat file.

Given the extensive variety of atom types, this seemed the most feasible approach. Would this method potentially lead to inaccuracies or misconfigurations in the docking process?

I'm looking for advice to ensure the integrity of the docking process.

rwxayheee commented 6 months ago

Hey @R-Stefano, do you actually need all of these atom types? Can you check whether affinity maps are actually generated for all of them? There might be a hard limit of 14 map types that autogrid4 can handle in one run; see: https://github.com/forlilab/Meeko/issues/87#issuecomment-1963097489 You can still do this (perhaps in two steps). I don't think having more maps leads to inaccuracies in the docking process.

R-Stefano commented 6 months ago

Hi @rwxayheee

Do I need them? I don't know. But with 50k ligands now, and later 100k, 1M, etc., most likely. So I thought I would just include all the types AutoDock supports by default and be done with it.

Docking runs with the parameter I set, so it looks like I can use more than 14 atom types. Sometimes I have to run autogrid twice to actually generate the maps, but I'm not sure whether that is due to the limit or to the Python subprocess I'm using.

However, as pointed out in the issue you shared with me, I'm also wondering whether I can remove the "duplicated" atom types such as Cl and CL from the list. Do you have any insights into this?
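To see which entries in the list are candidates for such duplication, one option is to group the type names case-insensitively. The sketch below only lists names that differ by case; whether two such types are really interchangeable would still need to be checked against the parameter file. The function name case_variants is hypothetical.

```python
from collections import defaultdict


def case_variants(types):
    """Group type names that differ only by letter case, e.g. Cl and CL."""
    groups = defaultdict(list)
    for t in types:
        groups[t.upper()].append(t)
    # Keep only the groups with more than one spelling.
    return [g for g in groups.values() if len(g) > 1]


ALL_TYPES = (
    "H,HD,HS,C,A,N,NA,NS,OA,OS,F,Mg,MG,P,SA,S,Cl,CL,Ca,CA,"
    "Mn,MN,Fe,FE,Zn,ZN,Br,BR,I,Z,G,GA,J,Q"
).split(",")

for group in case_variants(ALL_TYPES):
    print(" / ".join(group))
```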

rwxayheee commented 6 months ago

Hi @R-Stefano

> Docking is running with the param I have set, so looks like that I can use 14+ atoms

I was under the impression that autogrid4 cannot generate maps for more than 14 atom types in one go. Can you check whether you have all the affinity maps when you use -p ligand_types="H,HD,HS,C,A,N,NA,NS,OA,OS,F,Mg,MG,P,SA,S,Cl,CL,Ca,CA,Mn,MN,Fe,FE,Zn,ZN,Br,BR,I,Z,G,GA,J,Q"? For example, does autogrid4 generate the Br or BR map?
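A quick way to run that check is to compare the requested types against the map files on disk. This is a minimal sketch, assuming the maps are named <receptor>.<type>.map (for example 5FNQ.Br.map) and sit in one directory. The function name missing_maps is hypothetical.

```python
from pathlib import Path


def missing_maps(map_dir, receptor_stem, ligand_types):
    """Return the ligand types whose <stem>.<type>.map file is not in map_dir."""
    map_dir = Path(map_dir)
    return [
        t for t in ligand_types
        if not (map_dir / f"{receptor_stem}.{t}.map").is_file()
    ]
```

For example, missing_maps("./experiments", "5FNQ", ["C", "Br", "BR"]) would list whichever of those three maps are absent. On a case-insensitive filesystem, names such as Br and BR resolve to the same file, so this check cannot tell them apart there.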

If not, you can still prepare all possible maps in case some ligands need them. Prepare several gpf files:

1st GPF file with -p ligand_types="H,HD,HS,C,A,N,NA,NS,OA,OS,F"
2nd GPF file with -p ligand_types="Mg,MG,P,SA,S,Cl,CL,Ca,CA,Mn,MN"
3rd GPF file with -p ligand_types="Fe,FE,Zn,ZN,Br,BR,I,Z,G,GA,J,Q" (does autogrid4 work for the last four? I don't know.)

Running autogrid4 on each of the three gpf files individually gets you maps for all possible atom types, and you can then reuse these maps for all docking calculations. Some atom types are indeed duplicated and others are rare (at least, not used by Meeko any more? See here for a full list of possible ligand atom types that can be generated from the Meeko ligand preparation protocol).

Lastly, is your docking using the Vina or the AutoDock4 scoring function? If it is Vina, you don't need to generate affinity maps, as they are computed internally, independently of autogrid4.
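The splitting itself can be scripted. The sketch below cuts a type list into groups no larger than a chosen size and formats each group as a ligand_types value for -p. The default of 14 is only the suspected per-run limit mentioned above, not a confirmed one, and the function names are hypothetical.

```python
def split_ligand_types(types, max_per_run=14):
    """Split a list of atom types into consecutive groups of at most max_per_run."""
    if max_per_run < 1:
        raise ValueError("max_per_run must be at least 1")
    return [types[i:i + max_per_run] for i in range(0, len(types), max_per_run)]


def ligand_types_params(types, max_per_run=14):
    """Format each group as a 'ligand_types=...' value, one per GPF file."""
    return [
        "ligand_types=" + ",".join(group)
        for group in split_ligand_types(types, max_per_run)
    ]


ALL_TYPES = (
    "H,HD,HS,C,A,N,NA,NS,OA,OS,F,Mg,MG,P,SA,S,Cl,CL,Ca,CA,"
    "Mn,MN,Fe,FE,Zn,ZN,Br,BR,I,Z,G,GA,J,Q"
).split(",")

for param in ligand_types_params(ALL_TYPES):
    print("-p", param)
```

The values are printed without the surrounding double quotes because those belong to the shell; when the argument is passed as a list element to a Python subprocess call, no extra quoting is needed.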

R-Stefano commented 6 months ago

Hi @rwxayheee, yes, I do get 36 maps plus the fld and xyz files, as you can see here:

Screenshot 2024-03-30 at 22 35 18

I'm using AD4 and, soon, AutoDock-GPU.