colindaven / guppy_on_slurm

Splitting and accelerating the Oxford Nanopore basecaller guppy using CPU with the SLURM job scheduler
MIT License

GPU optimization #1

Open colindaven opened 3 years ago

colindaven commented 3 years ago

From @maxdeest

runners * chunks_per_runner * chunk_size ~= 100000 * [max GPU memory in GB] * 2

For an Ampere A100 GPU, to max out the 40 GB of GPU memory: --chunk_size 3000 --gpu_runners_per_device 8 --chunks_per_runner 512
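The rule of thumb above can be sketched as a small helper for comparing a candidate setting against the memory budget. This is an illustrative sketch, not part of guppy itself; the function names are hypothetical and the heuristic is taken directly from the formula quoted above:

```python
# Hypothetical helper for the rule of thumb:
# runners * chunks_per_runner * chunk_size ~= 100000 * [max GPU memory in GB] * 2

def chunk_budget(gpu_mem_gb: int) -> int:
    """Approximate total chunk capacity implied by the GPU memory, per the heuristic."""
    return 100_000 * gpu_mem_gb * 2

def total_chunk_load(runners: int, chunks_per_runner: int, chunk_size: int) -> int:
    """Total chunk load implied by a candidate guppy setting."""
    return runners * chunks_per_runner * chunk_size

# The A100 example from this thread: --chunk_size 3000
# --gpu_runners_per_device 8 --chunks_per_runner 512 on a 40 GB card.
budget = chunk_budget(40)                  # 8,000,000
load = total_chunk_load(8, 512, 3000)      # 12,288,000
print(budget, load)
```

Note that the quoted A100 setting actually overshoots the heuristic's budget; the formula is only an approximation, and the thread below confirms you should tune empirically per GPU.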

BeneKenobi commented 1 year ago

Why do you use fewer gpu_runners_per_device in https://github.com/colindaven/guppy_on_slurm/blob/d373459d1e32aa995e4ddff25eff08f74d3a500b/runbatch_gpu_guppy.sh#L42 ? Were there problems with more runners?

colindaven commented 1 year ago

I don't think there was a problem with more runners; it was probably just tuned for a different GPU. You need to experiment with the settings to reach the optimal speed for your GPU, and not everyone has an A100 with 40 GB of GPU RAM.

Just monitor the GPU device usage and RAM using nvidia-smi.
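For monitoring, a plain `nvidia-smi` snapshot works, or you can poll the relevant fields continuously. This is one possible invocation (assumes an NVIDIA driver is installed on the node), not a command from this repo:

```shell
# Poll GPU utilization and memory every 5 seconds while guppy runs
nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv -l 5
```

If GPU utilization stays well below 100% while memory is not exhausted, there is usually headroom to raise chunks_per_runner or gpu_runners_per_device.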

Remember that I/O has a big impact, so storing fast5 files on an SSD or other fast storage can decrease runtime considerably (from 36 h to 24 h for a human genome in my case).