OutOfMemoryError: CUDA out of memory. Tried to allocate 172.00 MiB (GPU 0; 79.35 GiB total capacity; 73.95 GiB already allocated; 12.19 MiB free; 76.80 GiB reserved in total by PyTorch) If
reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
GPUs: 2 x A100 (80 GB), DeepSpeed: ZeRO-3
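For reference, the hint in the error message can be checked and applied roughly as follows. This is only a sketch: it parses the numbers from the message above to see how far reserved memory exceeds allocated memory, then sets `PYTORCH_CUDA_ALLOC_CONF` before any CUDA work starts. The value `128` and the helper name `reserved_minus_allocated_gib` are illustrative assumptions, not recommendations from the report. The gap here works out to about 2.85 GiB, which is small relative to 76.80 GiB reserved, so fragmentation may be only part of the problem.

```python
import os
import re

# Numeric part of the error message from the report above.
MSG = (
    "Tried to allocate 172.00 MiB (GPU 0; 79.35 GiB total capacity; "
    "73.95 GiB already allocated; 12.19 MiB free; "
    "76.80 GiB reserved in total by PyTorch)"
)


def reserved_minus_allocated_gib(msg: str) -> float:
    """Hypothetical helper: gap between reserved and allocated memory, in GiB."""
    allocated = float(re.search(r"([\d.]+) GiB already allocated", msg).group(1))
    reserved = float(re.search(r"([\d.]+) GiB reserved", msg).group(1))
    return reserved - allocated


gap = reserved_minus_allocated_gib(MSG)
print(f"reserved - allocated = {gap:.2f} GiB")

# The allocator reads this variable when CUDA is first initialized, so it has
# to be set before the first CUDA allocation (or exported in the launching shell).
# 128 is an assumed starting value; setdefault keeps any value already exported.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")
```

With a multi-process launcher such as DeepSpeed, the variable presumably needs to be visible to every worker process, so exporting it in the shell before launching may be the more reliable option.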