eirrgang opened 1 year ago
We plan to move `gpus_per_rank` to a float value, so that, for example, 3 ranks could share a GPU. It is not yet fully clear how we will realize backend support for this, though; only LSF supports that natively at the moment. So it might take a while until this becomes useful for tasks on non-LSF resources.
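To make the intended semantics concrete, here is a rough sketch (the helper name below is hypothetical, not an existing API) of how a fractional `gpus_per_rank` could translate into a whole-device reservation:

```python
import math

def gpus_needed(n_ranks: int, gpus_per_rank: float) -> int:
    """Whole GPUs a job would reserve if gpus_per_rank may be fractional.

    Hypothetical helper for illustration only; not part of any backend.
    """
    # Round away floating-point noise before rounding up to whole devices.
    return math.ceil(round(n_ranks * gpus_per_rank, 6))

# Three ranks sharing one GPU would be expressed as gpus_per_rank = 1/3:
assert gpus_needed(n_ranks=3, gpus_per_rank=1 / 3) == 1
# The current integer semantics force at least one GPU per rank:
assert gpus_needed(n_ranks=3, gpus_per_rank=1) == 3
```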
> We plan to move `gpus_per_rank` to a float value, so that, for example, 3 ranks could share a GPU. It is not yet fully clear how we will realize backend support for this, though; only LSF supports that natively at the moment. So it might take a while until this becomes useful for tasks on non-LSF resources.
Ah. I hadn't thought about the degree or type of sharing. There may need to be an additional attribute or level of granularity. A distinction may also need to be made about which rank "owns" the GPU in the non-LSF case.
I don't think sharing a device between multiple processes is necessary for the feature to be useful. I think there is sufficient application code in the wild already that pins device usage to specific ranks or to rank 0. (I expect a lot of software evolves its multiprocessing and multi-GPU acceleration code paths somewhat independently.) E.g., a single-node simulation with 1 GPU and MPI-based parallelism for the non-GPU part of the workload.
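For what it's worth, here is a minimal sketch of the pinning pattern I mean, assuming mpi4py and CUDA-style device visibility; the per-node GPU count and the rank-modulo mapping are illustrative choices on my part, not anything prescribed by this project:

```python
import os

from mpi4py import MPI

rank = MPI.COMM_WORLD.Get_rank()

# Hypothetical: the per-node GPU count would normally come from the
# resource manager; default to the single-GPU case described above.
gpus_per_node = int(os.environ.get("GPUS_PER_NODE", "1"))

# Common application-level convention: each rank claims one device,
# or every rank falls back to device 0 when ranks outnumber GPUs.
device_id = rank % gpus_per_node

# Restricting visibility before any CUDA context is created effectively
# pins this rank to that single device.
os.environ["CUDA_VISIBLE_DEVICES"] = str(device_id)
```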
`gpus_per_rank` is an integer value, which assumes a job will use a clean whole-number multiple of GPUs relative to the number of ranks, i.e., never fewer GPUs than ranks. This does not fit well when there is an awkward ratio of GPUs to cores, or when an application makes the best use of compute resources with several processes per GPU.
For example, GROMACS MD simulations can distribute tasks across heterogeneous compute hardware, and typically reach optimal CPU utilization at a relatively small number of OpenMP threads per MPI rank. For instance, I might have 1, 2, or 4 jobs splitting 128 cores and 4 GPUs, with fewer than 32 OpenMP threads per rank.
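Working through those numbers (my own arithmetic, purely illustrative):

```python
# 128 cores and 4 GPUs on one node, split across 1, 2, or 4 concurrent jobs.
total_cores = 128
total_gpus = 4
threads_per_rank = 16  # comfortably below the 32-thread ceiling mentioned above

for n_jobs in (1, 2, 4):
    cores_per_job = total_cores // n_jobs
    gpus_per_job = total_gpus // n_jobs
    ranks_per_job = cores_per_job // threads_per_rank
    # Several ranks end up sharing each GPU, so gpus_per_rank is fractional,
    # which an integer-valued field cannot express.
    gpus_per_rank = gpus_per_job / ranks_per_job
    print(f"{n_jobs} job(s): {ranks_per_job} ranks x {threads_per_rank} OpenMP threads,"
          f" gpus_per_rank = {gpus_per_rank}")
```

In each of these layouts the natural value works out to 0.5 GPUs per rank, which is exactly the kind of ratio an integer `gpus_per_rank` cannot represent.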
Can GPUs be reserved independently of the number of cores?