choderalab / yank

An open, extensible Python framework for GPU-accelerated alchemical free energy calculations.
http://getyank.org
MIT License

YANK on EC2 #433

Open drthock opened 8 years ago

drthock commented 8 years ago

Has YANK been tested on EC2 yet? I'm running the abl-imatinib-explicit example on a g2.2xlarge instance, and it looks like this single calculation will take ~4.5 days to complete. Have you seen anything on that time scale? On my personal GTX 1080, the same calculation took ~10 h.

Regarding the --mpi option, I'm confused about how it works with CPUs and GPUs. Is it possible to distribute a job across multiple CPUs while still using the one available GPU? This is also for running on a g2.2xlarge. Eventually we would like to run on a g2.8xlarge, which has 32 CPUs and 4 GPUs.

jchodera commented 8 years ago

> Has YANK been tested on EC2 yet? I'm running the abl-imatinib-explicit example on a g2.2xlarge instance, and it looks like this single calculation will take ~4.5 days to complete. Have you seen anything on that time scale? On my personal GTX 1080, the same calculation took ~10 h.

We are planning to set up an optimized EC2 instance to make it easy to effectively use YANK on EC2, but haven't tackled this yet. I imagine the key is achieving robust parallelization across multiple nodes if this proves necessary.

> Regarding the --mpi option, I'm confused about how it works with CPUs and GPUs. Is it possible to distribute a job across multiple CPUs while still using the one available GPU? This is also for running on a g2.2xlarge. Eventually we would like to run on a g2.8xlarge, which has 32 CPUs and 4 GPUs.

Currently, you should select one MPI process per GPU you wish to utilize. Heterogeneous CPU/GPU execution is not currently supported. The speedup of the GPU is so much greater than the CPU that utilizing the idle CPU threads will not give much of a speedup except for implicit solvent free energy calculations, where the remaining CPU threads could execute the ligand-in-solvent thermodynamic legs in parallel. This might be something we add support for after YANK 1.0.
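
To illustrate the one-process-per-GPU rule, here is a minimal sketch of how a launcher script might pin each MPI rank to its own GPU. It is not YANK's actual mechanism: the helper names `gpu_index_for_rank` and `bind_process_to_gpu` are hypothetical, and the sketch assumes ranks are placed contiguously on each node and that the process sets the standard CUDA environment variable `CUDA_VISIBLE_DEVICES` before any CUDA context is created.

```python
import os


def gpu_index_for_rank(rank, gpus_per_node):
    """Map an MPI rank to a local GPU index, one process per GPU.

    Hypothetical helper for illustration only; it assumes ranks are
    assigned contiguously on each node.
    """
    if gpus_per_node < 1:
        raise ValueError("gpus_per_node must be at least 1")
    if rank < 0:
        raise ValueError("rank must be non-negative")
    return rank % gpus_per_node


def bind_process_to_gpu(rank, gpus_per_node, environ=None):
    """Restrict this process to a single GPU via CUDA_VISIBLE_DEVICES.

    Must run before any CUDA context is created. `environ` defaults to
    os.environ; a plain dict can be passed in for testing.
    """
    env = os.environ if environ is None else environ
    index = gpu_index_for_rank(rank, gpus_per_node)
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return index


if __name__ == "__main__":
    # Example: four MPI processes on a node with four GPUs (e.g. a g2.8xlarge).
    for rank in range(4):
        print("rank", rank, "-> GPU", gpu_index_for_rank(rank, 4))
```

On a g2.2xlarge (one GPU) this reduces to a single process on GPU 0; on a g2.8xlarge, four processes would each see exactly one of the four GPUs, and the remaining CPU cores stay idle.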

Please remember that YANK is still experimental software! While we are in the process of finishing up the implementation of major features for YANK 1.0, we are only beginning the validation and optimization phase, so please treat the results and speed you get out of the current version of YANK as very much preliminary!

Lnaden commented 7 years ago

We'll look at deployments on other platforms after 1.0.
