Closed MANLP-suda closed 1 year ago
As you said, OOM errors occur frequently in a single-GPU environment. In that situation you can use the gpus argument, which lets you run on multiple GPUs in parallel. The gpus argument is a list, so you can put several GPU indices in it. If you are in a multi-GPU environment, this option spreads the workload across the VRAM of all listed devices, which usually resolves the out-of-memory problem.
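A minimal sketch of how a list-typed gpus option might be parsed and used (the flag name, defaults, and DataParallel hand-off here are assumptions; check this project's actual run script for the real option):

```python
import argparse

# Hypothetical CLI sketch: parse a list of GPU indices from a --gpus flag.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--gpus", type=int, nargs="+", default=[0],
    help="GPU device indices to use in parallel, e.g. --gpus 0 1 2 3",
)

# Simulate running with two GPUs requested on the command line.
args = parser.parse_args(["--gpus", "0", "1"])
print(args.gpus)  # -> [0, 1]

# In PyTorch, a parsed list like this could then be handed to DataParallel
# so each batch is split across the listed devices:
#   model = torch.nn.DataParallel(model, device_ids=args.gpus)
```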
CUDA out of memory. Tried to allocate 12.00 MiB (GPU 0; 39.45 GiB total capacity; 37.87 GiB already allocated; 10.25 MiB free; 38.30 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I find it very easy to exceed the available VRAM. How do you solve this problem?
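The error message above also suggests one knob worth trying before adding GPUs: capping the allocator's split block size via PYTORCH_CUDA_ALLOC_CONF to reduce fragmentation. A hedged sketch (128 MiB is an arbitrary starting value to tune, and train.py is a placeholder for this project's entry script):

```shell
# Cap the CUDA caching allocator's split block size to reduce fragmentation
# (available in PyTorch >= 1.10; tune the value for your workload).
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
# python train.py   # placeholder for the actual training command
```

Note that this mainly helps when reserved memory is much larger than allocated memory; in the trace above the two are close (38.30 vs 37.87 GiB), so the model may simply be too large for one card.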