set_per_process_memory_fraction(fraction, device=None): Set the memory fraction for a process. The fraction is used to limit the caching allocator's memory allocation on a CUDA device. The allowed value equals the total visible memory multiplied by fraction. Attempting to allocate more than the allowed value in a process raises an out-of-memory error in the allocator.
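A quick sketch of the arithmetic described above (the 16 GiB figure is an illustrative value, not taken from the issue):

```python
# The caching allocator's cap is the total visible memory times the fraction.
total_visible = 16 * 1024**3   # e.g. a device with 16 GiB visible memory
fraction = 0.5                 # value passed to set_per_process_memory_fraction

allowed = int(total_visible * fraction)
assert allowed == 8 * 1024**3  # allocations beyond this cap raise OOM
```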
Proposal
This setting takes a fraction [0-1] and an optional device. Introduce an environment variable alongside ENABLE_CUDA, named CUDA_MEMORY_FRACTION, whose value is 0.0-1.0 and is passed as fraction. Additionally, if set, check and prefer CUDA_MEMORY_FRACTION_... variable(s), where the value has the same format and the ... suffix is passed as device for each variable found.
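The lookup described above could be sketched as follows. This is an illustrative helper, not an existing PyTorch API; the variable names CUDA_MEMORY_FRACTION and CUDA_MEMORY_FRACTION_&lt;device&gt; are the ones proposed here, and the integer-suffix parsing is an assumption about what "..." would hold:

```python
import os

PREFIX = "CUDA_MEMORY_FRACTION"

def memory_fractions_from_env(environ=os.environ):
    """Return a list of (fraction, device) pairs to apply.

    Per-device variables (CUDA_MEMORY_FRACTION_<device>) are preferred;
    the bare CUDA_MEMORY_FRACTION is used only when no per-device
    variable is set, in which case device is left as None.
    """
    def parse_fraction(name, value):
        fraction = float(value)
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"{name} must be in [0.0, 1.0], got {value!r}")
        return fraction

    per_device = []
    for name, value in environ.items():
        if name.startswith(PREFIX + "_"):
            # Assumed suffix format: a CUDA device index, e.g.
            # CUDA_MEMORY_FRACTION_0 -> device 0.
            device = int(name[len(PREFIX) + 1:])
            per_device.append((parse_fraction(name, value), device))

    if per_device:
        return per_device
    if PREFIX in environ:
        return [(parse_fraction(PREFIX, environ[PREFIX]), None)]
    return []

# Each (fraction, device) pair would then be forwarded to
# torch.cuda.set_per_process_memory_fraction(fraction, device).
```

For example, with `CUDA_MEMORY_FRACTION=0.9` and `CUDA_MEMORY_FRACTION_0=0.25` both set, the helper returns only `[(0.25, 0)]`, matching the "check and prefer" rule above.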
Questions
[ ] Is there a better name than CUDA_MEMORY_FRACTION/CUDA_MEMORY_FRACTION_...?
One use case is AWS vGPU support, so that multiple consumers of the vGPU device(s) don't assume they have exclusive rights to the full resource.
Summary
PyTorch allows limiting the GPU memory available to a process. This is useful, for example, when a GPU resource is shared.