ccsb-scripps / AutoDock-GPU

AutoDock for GPUs and other accelerators
https://ccsb.scripps.edu/autodock
GNU General Public License v2.0

Maximum number of --nruns #232

Closed Mishakolok closed 4 months ago

Mishakolok commented 1 year ago

Hello, it seems that the maximum number of nruns is limited to 1000. Is there a particular reason for this? I think that with modern GPUs, increasing this limit shouldn't seriously impact performance (for example, on my RTX 4070 a docking simulation with 1000 runs and a 90 x 90 x 90 Å box takes only several seconds).

diogomart commented 1 year ago

I'm curious, for your system, using the default number of runs:

  1. do you get comparable results to 1000 runs?
  2. how much faster is it than with 1000 runs?

Mishakolok commented 1 year ago

My system consists of human serum albumin as the receptor (PDB 1N5U) and a porphyrin derivative as the ligand (for example, TmPyP4, https://atb.uq.edu.au/molecule.py?molid=1271234). My goal was to identify potential binding sites via blind docking so that they could be checked against our experimental data. Timing with default settings (nrun=20):

Job #1 took 1.797 sec after waiting 2.507 sec for setup

(Thread 3 is processing Job #1)
Run time of entire job set (1 file): 4.381 sec
Processing time: 0.076 sec

All jobs ran without errors.

And with nruns=1000

Job #1 took 2.751 sec after waiting 2.522 sec for setup

(Thread 9 is processing Job #1)
Run time of entire job set (1 file): 9.408 sec
Processing time: 4.131 sec

All jobs ran without errors.

All other settings were left at their defaults (except --npdb, which was set to 1). The best conformations look quite similar (see image: green for 20 runs, red for 1000 runs). However, the energy of the best pose for 20 runs is not stable: it varies between -5.3 and -5.7 kcal/mol across runs launched with the same command. The same applies to the 1000-run scenario.

Increasing the population size to 2048 (the maximum) doesn't really reduce the variability for 20 runs, while the results for 1000 runs become more stable (between -5.6 and -5.7 kcal/mol). I only ran each case about 5 times, so this conclusion might not hold. With a population size of 2048, the timings for 20 runs look like this:

Job #1 took 1.728 sec after waiting 2.511 sec for setup

(Thread 5 is processing Job #1)
Run time of entire job set (1 file): 5.591 sec
Processing time: 1.351 sec

And for 1000 runs

Job #1 took 23.167 sec after waiting 2.498 sec for setup

(Thread 1 is processing Job #1)
Run time of entire job set (1 file): 93.131 sec
Processing time: 67.428 sec

Obviously, this is significantly longer than the default case, but for single dockings it might be acceptable.
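To put the four timings above side by side, here is a minimal Python sketch that computes how much the job time grows when going from 20 to 1000 runs. The numbers are copied from the "Job #1 took ..." lines in this comment; the dictionary layout and the function names `slowdown` and `ms_per_run` are just for illustration, and single measurements like these are only a rough guide.

```python
# Job times in seconds, copied from the "Job #1 took ..." log lines above.
# These exclude setup and post-processing time.
job_seconds = {
    # (population size, number of runs): job time in seconds
    ("default", 20): 1.797,
    ("default", 1000): 2.751,
    ("2048", 20): 1.728,
    ("2048", 1000): 23.167,
}


def slowdown(psize):
    """Ratio of the 1000-run job time to the 20-run job time."""
    return job_seconds[(psize, 1000)] / job_seconds[(psize, 20)]


def ms_per_run(psize, nruns):
    """Average job time per docking run, in milliseconds."""
    return 1000.0 * job_seconds[(psize, nruns)] / nruns


for psize in ("default", "2048"):
    print(
        f"population {psize}: 1000 runs take {slowdown(psize):.1f}x "
        f"as long as 20 runs, {ms_per_run(psize, 1000):.1f} ms per run"
    )
```

With these measurements, 50 times as many runs cost roughly 1.5 times the job time at the default population size and roughly 13 times at a population size of 2048, so in both cases the per-run cost drops as the number of runs goes up.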

I hope this information will be of use to you. I am a bit new to the docking field, so I might have missed some important aspects of it.

diogomart commented 1 year ago

Interesting! Thanks for sharing :+1:

@atillack seems like a good case for allowing larger nrun. What do you think?

atillack commented 1 year ago

@Mishakolok @diogomart We can probably increase that number a little bit. There's one little wrinkle I want to take care of first: the CUDA code allocates memory based on the maximum number of runs. I'll PR when ready :-)
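As a rough illustration of why allocating by the maximum number of runs matters, the Python sketch below estimates the size of a single buffer that holds one value per gene, per individual, per run. This is not AutoDock-GPU's actual memory layout: the function name, the one-float-per-gene assumption, and the example value of 64 genes per individual are all hypothetical.

```python
def population_buffer_bytes(max_runs, pop_size, genes_per_individual, bytes_per_gene=4):
    """Estimate the bytes needed for one buffer covering every run.

    Hypothetical model: one 4-byte float per gene, per individual, per run.
    The buffer is sized by max_runs, not by the number of runs requested.
    """
    return max_runs * pop_size * genes_per_individual * bytes_per_gene


# Example with a hypothetical 64 genes per individual.
for max_runs in (1000, 2000, 4000):
    size_mib = population_buffer_bytes(max_runs, 2048, 64) / 2**20
    print(f"max_runs={max_runs}: {size_mib:.0f} MiB per buffer")
```

Under this model the buffer grows linearly with the cap, so raising the limit costs memory even for jobs that request only 20 runs, unless the allocation is tied to the number of runs actually requested.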

atillack commented 1 year ago

@Mishakolok @diogomart PR #233 is up.