minounou opened this issue 6 months ago
It takes about 40 GB of VRAM.
OK, thanks! Also, would the "Nvidia 4xA10G large (96 GB VRAM)" hardware work? I tried it before but still got a CUDA out-of-memory error; it seems the model needs a single card with more than 40 GB? (Hugging Face's A100-40G is not available this morning, so I cannot start the Space on it, and the H100-80G is not available yet either.)
File "/home/user/app/opensora/models/ae/videobase/modules/resnet_block.py", line 75, in forward
h = nonlinearity(h)
File "/home/user/app/opensora/models/ae/videobase/modules/ops.py", line 15, in nonlinearity
return x * torch.sigmoid(x)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.03 GiB. GPU 0 has a total capacty of 22.19 GiB of which 197.50 MiB is free. Process 256775 has 21.99 GiB memory in use. Of the allocated memory 19.30 GiB is allocated by PyTorch, and 2.38 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
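As a side note, here is a minimal sketch (assuming PyTorch 2.x) of the allocator setting the error message itself suggests; it can reduce fragmentation, but it cannot compensate for a card that simply has less VRAM than the model needs:

```python
# Minimal sketch, assuming PyTorch 2.x: the allocator config is read when the
# CUDA caching allocator initializes, so set it before any CUDA allocation.
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # import after setting the env var to be safe

# ... load and run the pipeline as before ...
```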
Support for splitting a model across multiple GPUs is a work in progress: https://github.com/huggingface/diffusers/pull/6396. So diffusers can't currently shard models with device_map='auto' the way transformers can.
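For comparison, a hedged sketch of the transformers-style sharding referred to above; the checkpoint name is only a placeholder, not Open-Sora-Plan:

```python
# Sketch of transformers' automatic multi-GPU sharding (requires accelerate).
# diffusers pipelines do not support this yet; see the PR linked above.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",   # placeholder checkpoint, not Open-Sora-Plan
    device_map="auto",        # spreads layers across all visible GPUs
)
```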
Could you provide more detailed information about multi-GPU training or inference?
Hi,
can I duplicate the Hugging Face Space and run it on a paid GPU there? (I tried on my local NVIDIA RTX 4090 (24 GB) and got a CUDA out-of-memory error.) Thanks! https://huggingface.co/spaces/LanguageBind/Open-Sora-Plan-v1.0.0
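One quick sanity check (assuming PyTorch with CUDA is installed) of whether a local card meets the ~40 GB figure quoted above:

```python
# Print the total VRAM of the first visible GPU; an RTX 4090 reports
# roughly 24 GiB, which is why inference runs out of memory here.
import torch

props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB total VRAM")
```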