madgraph5 / madgraph4gpu

GPU development for the Madgraph5_aMC@NLO event generator software package

Support for multi-GPU nodes? #836

Open valassi opened 3 months ago

valassi commented 3 months ago

Just a reminder, we should do something about nodes with multiple GPUs.

This was for instance asked by Jin for CMS, as he found that one of our condor nodes has 4 GPUs (b9g57n8656.cern.ch).

Presently we simply call cudaSetDevice(0) and hipSetDevice(0) (actually both through gpuSetDevice(0)). In any case, for the moment the code is meant to use only one GPU at a time. So one option is to keep the code the same, and use the CUDA/etc-specific environment variables to select one single GPU as the visible GPU (I believe this is CUDA_VISIBLE_DEVICES, while NVIDIA_VISIBLE_DEVICES is instead for visibility inside docker containers?)
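A minimal sketch of the "keep the code the same" option, assuming one worker process per GPU. The helper names (pickGpu, exportVisibleGpu) and the round-robin mapping are hypothetical and not part of madgraph4gpu. The idea is that a launcher exports the variable before the CUDA/HIP runtime is initialised, so the single visible GPU shows up as device 0 and the existing gpuSetDevice(0) call needs no change. HIP_VISIBLE_DEVICES is set alongside as the assumed AMD counterpart.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper (not in madgraph4gpu): map a worker index to one
// physical GPU on the node, round-robin over the nGpus available GPUs.
inline int pickGpu( int workerIndex, int nGpus )
{
  if( nGpus <= 0 ) throw std::invalid_argument( "nGpus must be positive" );
  if( workerIndex < 0 ) throw std::invalid_argument( "workerIndex must be non-negative" );
  const int gpu = workerIndex % nGpus;
  assert( gpu >= 0 && gpu < nGpus );
  return gpu;
}

// Hypothetical helper: restrict this process (and its children) to one GPU.
// This must run before the first CUDA/HIP call, because the runtime reads
// the variables when it initialises. Returns the physical GPU index chosen.
inline int exportVisibleGpu( int workerIndex, int nGpus )
{
  const int gpu = pickGpu( workerIndex, nGpus );
  const std::string value = std::to_string( gpu );
  // setenv is POSIX; the last argument (1) overwrites any existing value.
  if( setenv( "CUDA_VISIBLE_DEVICES", value.c_str(), 1 ) != 0 )
    throw std::runtime_error( "setenv CUDA_VISIBLE_DEVICES failed" );
  if( setenv( "HIP_VISIBLE_DEVICES", value.c_str(), 1 ) != 0 )
    throw std::runtime_error( "setenv HIP_VISIBLE_DEVICES failed" );
  return gpu;
}
```

For example, on the 4-GPU node a launcher could call exportVisibleGpu(workerIndex, 4) for workers 0 to 3 before starting the unchanged executable. Each process would then see exactly one GPU, as device 0. The same effect can be had without any code by exporting the variable in the job wrapper script.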

Some interesting reads on multi-GPU