ccsb-scripps / AutoDock-GPU

AutoDock for GPUs and other accelerators
https://autodock.scripps.edu
GNU General Public License v2.0

AutoDock-GPU: AutoDock for GPUs and other accelerators

About

Citation

Accelerating AutoDock4 with GPUs and Gradient-Based Local Search, J. Chem. Theory Comput. 2021, 10.1021/acs.jctc.0c01006


Features

Setup

| Operating system | CPU | GPU |
|:--|:--|:--|
| CentOS 6.7 & 6.8 / Ubuntu 14.04 & 16.04 | Intel SDK for OpenCL 2017 | OpenCL / CUDA >= 11 |
| macOS Catalina 10.15.1 | Apple / Intel | Apple / Intel Iris, Radeon Vega 64, Radeon VII |

Other environments or configurations likely work as well, but are untested. Since commit 846dc2b, AutoDock-GPU requires a C++17-capable compiler, which in practice means GCC >= 9. This also means that the minimum version supported for CUDA compilation is CUDA 11. However, since all CUDA versions also ship with OpenCL, older versions can still be used via the OpenCL code path (DEVICE=OCLGPU).

Compilation

The first step is to set the environment variables GPU_INCLUDE_PATH and GPU_LIBRARY_PATH, as described here: https://github.com/ccsb-scripps/AutoDock-GPU/wiki/Guideline-for-users
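As a minimal sketch, assuming a CUDA toolkit installed under /usr/local/cuda (a common Linux location, but an assumption here, as is the helper variable GPU_PREFIX), the two variables could be set as follows. The guideline linked above remains the authoritative reference for the correct paths on a given system.

```shell
# Assumed install prefix (hypothetical helper variable); point it at your GPU toolkit.
GPU_PREFIX="${GPU_PREFIX:-/usr/local/cuda}"

# The two variables the build expects, derived from the assumed prefix.
export GPU_INCLUDE_PATH="$GPU_PREFIX/include"
export GPU_LIBRARY_PATH="$GPU_PREFIX/lib64"

# Print the result so it can be checked before running make.
echo "GPU_INCLUDE_PATH=$GPU_INCLUDE_PATH"
echo "GPU_LIBRARY_PATH=$GPU_LIBRARY_PATH"
```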

make DEVICE=<TYPE> NUMWI=<NWI>
| Parameters | Description | Values |
|:--|:--|:--|
| <TYPE> | Accelerator chosen | CPU, GPU, CUDA, OCLGPU, OPENCL |
| <NWI> | work-group/thread block size | 1, 2, 4, 8, 16, 32, 64, 128, 256 |

When DEVICE=GPU is chosen, the Makefile automatically tests whether it can compile CUDA successfully. To override, use DEVICE=CUDA or DEVICE=OCLGPU. The CPU target is only supported using OpenCL. Furthermore, an OpenMP-enabled overlapped pipeline (for setup and processing) can be compiled with OVERLAP=ON.

Hints: The best work-group size depends on the GPU and workload. Try NUMWI=128 or NUMWI=64 for modern cards with the example workloads. On macOS, use NUMWI=1 for CPUs.
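Since the best work-group size depends on the hardware, it can be convenient to build several variants and compare them. The sketch below wraps the documented make invocation in a hypothetical helper, build_variants, which is not part of the project. The MAKE variable is likewise an assumption of this sketch; setting it to "echo make" prints the commands instead of running them.

```shell
# Set MAKE="echo make" for a dry run that only prints the commands.
MAKE="${MAKE:-make}"

# build_variants DEVICE NUMWI...
# Runs one build per work-group size using the documented make variables.
build_variants() {
    device="$1"
    shift
    for nwi in "$@"; do
        # $MAKE is intentionally unquoted so that "echo make" splits into two words.
        $MAKE DEVICE="$device" NUMWI="$nwi" || return 1
    done
}

# Example: build_variants GPU 64 128
```

Stopping at the first failed build keeps a compiler error from being buried under the output of later builds.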

After successful compilation, the host binary autodock_<type>_<N>wi is placed under bin.
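For scripting, the binary path can be derived from the accelerator type and work-group size. The following is a small sketch following the naming used in the usage commands of this README; binary_path is a hypothetical helper, and it only accepts the documented types (cpu, gpu) and work-group sizes.

```shell
# binary_path TYPE N
# Prints the expected path of the host binary, or fails for undocumented values.
binary_path() {
    case "$1" in
        cpu|gpu) ;;
        *) echo "unknown type: $1" >&2; return 1 ;;
    esac
    case "$2" in
        1|2|4|8|16|32|64|128|256) ;;
        *) echo "unsupported work-group size: $2" >&2; return 1 ;;
    esac
    echo "bin/autodock_${1}_${2}wi"
}

binary_path gpu 64   # prints bin/autodock_gpu_64wi
```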

| Binary-name portion | Description | Values |
|:--|:--|:--|
| <type> | Accelerator chosen | cpu, gpu |
| <N> | work-group/thread block size | 1, 2, 4, 8, 16, 32, 64, 128, 256 |

Usage

Basic command

./bin/autodock_<type>_<N>wi \
--ffile <protein>.maps.fld \
--lfile <ligand>.pdbqt \
--nrun <nruns>
| Mandatory options | Description | Value |
|:--|:--|:--|
| --ffile -M | Protein file | <protein>.maps.fld |
| --lfile -L | Ligand file | <ligand>.pdbqt |

Both options can alternatively be provided in the contents of the files specified with --filelist (-B) (see below for format) and --import_dpf (-I) (AD4 dpf file format).

Example

./bin/autodock_gpu_64wi \
--ffile ./input/1stp/derived/1stp_protein.maps.fld \
--lfile ./input/1stp/derived/1stp_ligand.pdbqt

By default the output log file is written in the current working folder. Examples of output logs can be found under examples/output.
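The table of supported arguments lists the ligand basename as the default for --resnam, so the name of the log can be predicted from the ligand path. The sketch below assumes that the log carries a ".dlg" extension appended to that basename. This is an assumption of the example, and default_log_name is a hypothetical helper.

```shell
# default_log_name LIGAND_PATH
# Assumption: the log is named <ligand basename>.dlg, with ".pdbqt" stripped.
default_log_name() {
    name="$(basename "$1" .pdbqt)"
    echo "${name}.dlg"
}

default_log_name ./input/1stp/derived/1stp_ligand.pdbqt
```

A script could use this to check in the current working folder whether a docking has already produced its log before launching it again.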

Supported arguments

| Argument | Description | Default value |
|:--|:--|:--|
| INPUT | | |
| --lfile -L | Ligand pdbqt file | no default |
| --ffile -M | Grid map files descriptor fld file | no default |
| --flexres -F | Flexible residue pdbqt file | no default |
| --filelist -B | Batch file | no default |
| --import_dpf -I | Import AD4-type dpf input file (only partial support) | no default |
| --xraylfile -R | reference ligand file for RMSD analysis | ligand file |
| CONVERSION | | |
| --xml2dlg -X | One (or many) AD-GPU xml file(s) to convert to dlg(s) | no default |
| OUTPUT | | |
| --resnam -N | Name for docking output log | ligand basename |
| --contact_analysis -C | Perform distance-based analysis (description below) | 0 (no) |
| --xmloutput -x | Specify if xml output format is wanted | 1 (yes) |
| --dlgoutput -d | Control if dlg output is created | 1 (yes) |
| --dlg2stdout -2 | Write dlg file output to stdout (if not OVERLAP=ON) | 0 (no) |
| --rlige | Print reference ligand energies | 0 (no) |
| --gfpop | Output all poses from all populations of each LGA run | 0 (no) |
| --npdb | # pose pdbqt files from populations of each LGA run | 0 |
| --gbest | Output single best pose as pdbqt file | 0 (no) |
| --clustering | Output clustering analysis in dlg and/or xml file | 1 (yes) |
| --hsym | Handle symmetry in RMSD calc. | 1 (yes) |
| --rmstol | RMSD clustering tolerance | 2 (Å) |
| SETUP | | |
| --devnum -D | OpenCL/Cuda device number (counting starts at 1) | 1 |
| --loadxml -c | Load initial population from xml results file | no default |
| --seed -s | Random number seeds (up to three comma-sep. integers) | time, process id |
| SEARCH | | |
| --heuristics -H | Ligand-based automatic search method and # evals | 1 (yes) |
| --heurmax -E | Asymptotic heuristics # evals limit (smooth limit) | 12000000 |
| --autostop -A | Automatic stopping criterion based on convergence | 1 (yes) |
| --asfreq -a | AutoStop testing frequency (in # of generations) | 5 |
| --nrun -n | # LGA runs | 20 |
| --nev -e | # Score evaluations (max.) per LGA run | 2500000 |
| --ngen -g | # Generations (max.) per LGA run | 42000 |
| --lsmet -l | Local-search method | ad (ADADELTA) |
| --lsit -i | # Local-search iterations (max.) | 300 |
| --psize -p | Population size | 150 |
| --mrat | Mutation rate | 2 (%) |
| --crat | Crossover rate | 80 (%) |
| --lsrat | Local-search rate | 100 (%) |
| --trat | Tournament (selection) rate | 60 (%) |
| --dmov | Maximum LGA movement delta | 6 (Å) |
| --dang | Maximum LGA angle delta | 90 (°) |
| --rholb | Solis-Wets lower bound of rho parameter | 0.01 |
| --lsmov | Solis-Wets movement delta | 2 (Å) |
| --lsang | Solis-Wets angle delta | 75 (°) |
| --cslim | Solis-Wets cons. success/failure limit to adjust rho | 4 |
| --stopstd | AutoStop energy standard deviation tolerance | 0.15 (kcal/mol) |
| --initswgens | Initial # generations of Solis-Wets instead of -lsmet | 0 (no) |
| SCORING | | |
| --derivtype -T | Derivative atom types (e.g. C1,C2,C3=C/S4=S/H5=HD) | no default |
| --modpair -P | Modify vdW pair params (e.g. C1:S4,1.60,1.200,13,7) | no default |
| --ubmod -u | Unbound model: 0 (bound), 1 (extended), 2 (compact) | 0 (same as bound) |
| --smooth | Smoothing parameter for vdW interactions | 0.5 (Å) |
| --elecmindist | Min. electrostatic potential distance (w/ dpf: 0.5 Å) | 0.01 (Å) |
| --modqp | Use modified QASP from VirtualDrug or AD4 original | 0 (no, use AD4) |

Autostop is ON by default since v1.4. The collective distribution of scores among all LGA populations is tested for convergence every <asfreq> generations, and docking is stopped if the top-scored poses exhibit a small variance (the tolerance is set with --stopstd). This avoids wasting computation after the best docking solutions have been found.

The heuristics set the number of evaluations to a generously large value that is a function of the number of rotatable bonds. This prevents unreasonably long dockings in cases where autostop fails to detect convergence. In our experience, --heuristics 1 and --autostop 1 allow sufficient score evaluations for searching the energy landscape accurately. For molecules with many rotatable bonds (e.g. about 15 or more) it may be advisable to increase --heurmax.
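The advice on flexible ligands can be sketched as a small script. It rests on two assumptions that do not come from this README: that the TORSDOF record of the ligand PDBQT file approximates the number of rotatable bonds, and that doubling the default --heurmax is a reasonable increase (the factor 2 is purely illustrative). Both helper functions are hypothetical.

```shell
# torsion_count LIGAND.pdbqt
# Assumption: the TORSDOF record approximates the rotatable-bond count.
# Prints 0 when the file has no TORSDOF record.
torsion_count() {
    awk '$1 == "TORSDOF" { n = $2 } END { print n + 0 }' "$1"
}

# heurmax_for LIGAND.pdbqt
# Prints the documented default (12000000) below 15 torsions,
# and an illustrative doubled value (assumption) at 15 or more.
heurmax_for() {
    if [ "$(torsion_count "$1")" -ge 15 ]; then
        echo 24000000
    else
        echo 12000000
    fi
}

# Example:
# ./bin/autodock_gpu_64wi --ffile <protein>.maps.fld --lfile <ligand>.pdbqt \
#     --heurmax "$(heurmax_for <ligand>.pdbqt)"
```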

When the heuristics are used and --nev <max evals> is provided as a command line argument, --nev acts as a hard upper limit on the number of evaluations the heuristics suggest. In contrast, --heurmax is the asymptotic (rolling-off) limit of the heuristics' formula for the number of evaluations and should only be changed with caution.

The batch file is a text file containing the parameters to --ffile, --lfile, and --resnam, each on an individual line. It is possible to specify the protein grid map file on only one line, in which case it is used for all ligands. Here is an example:

./receptor1.maps.fld
./ligand1.pdbqt
Ligand 1
./receptor2.maps.fld
./ligand2.pdbqt
Ligand 2
./receptor3.maps.fld
./ligand3.pdbqt
Ligand 3
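For many ligands docked against a single receptor, the batch file can be generated rather than written by hand. The sketch below uses the single-receptor form described above: the grid map descriptor appears once, followed by a path line and a name line per ligand. The helper make_batch_file and all paths in the example are hypothetical.

```shell
# make_batch_file MAPS_FLD LIGAND_DIR
# Prints a batch file to stdout: the grid map descriptor once, then a ligand
# path line and a name line for every *.pdbqt file found in LIGAND_DIR.
make_batch_file() {
    echo "$1"
    for lig in "$2"/*.pdbqt; do
        [ -e "$lig" ] || continue   # skip when the glob matches nothing
        echo "$lig"
        basename "$lig" .pdbqt      # ligand basename serves as the --resnam line
    done
}

# Example (hypothetical paths):
# make_batch_file ./receptor.maps.fld ./ligands > batch.txt
# ./bin/autodock_gpu_64wi --filelist batch.txt
```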

When the distance-based analysis is used (--contact_analysis 1 or --contact_analysis <R_cutoff>,<H_cutoff>,<V_cutoff>), the ligand poses of a given run (either after a docking run or even when --xml2dlg <xml file(s)> is used) are analyzed in terms of their individual atom distances to the target protein with individual cutoffs for:

The contact analysis results for each pose are output in dlg lines starting with ANALYSIS: and/or in <contact_analysis> blocks in xml file output.
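Because the dlg results are marked by a fixed line prefix, they can be pulled out with standard tools. This is a minimal sketch: contact_lines is a hypothetical helper, the file name in the example is an assumption, and the sketch makes no claim about the layout of the fields after the prefix.

```shell
# contact_lines DLG_FILE
# Prints every line of a dlg file that starts with "ANALYSIS:".
contact_lines() {
    grep '^ANALYSIS:' "$1"
}

# Example (hypothetical file name):
# contact_lines 1stp_ligand.dlg > contacts.txt
```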

Documentation

Visit the project Wiki.

AutoDock-GPU requires Meeko for preparing the receptor and ligands, and autogrid for calculating the affinity grid maps, including the file ending in .maps.fld that is passed to option -M or --ffile.

Visit the Meeko documentation for more information and tutorials covering AutoDock-GPU usage.

Contributing