bytedance/SALMONN

SALMONN: Speech Audio Language Music Open Neural Network
https://bytedance.github.io/SALMONN/
Apache License 2.0

Inference CUDA OOM on smaller GPU #41

Closed: rrscholarship closed this issue 4 months ago

rrscholarship commented 4 months ago

Is there a way to run inference on a 24 GB GPU? An A100-SXM-80GB is not accessible for now.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 136.00 MiB (GPU 0; 23.65 GiB total capacity; 23.16 GiB already allocated; 34.31 MiB free; 23.17 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
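For context, the trace shows the 24 GB card is already essentially full (23.16 GiB allocated out of 23.65 GiB), so the max_split_size_mb hint from the error message only mitigates fragmentation; it cannot shrink the model itself, since a 13B LLM in fp16 needs roughly 26 GB for its weights alone. Still, applying the allocator hint is cheap to try. A minimal sketch of doing so before any CUDA allocation (the 128 MB value is an assumption, not from the repo):

```python
# Sketch: apply the allocator hint suggested by the OOM message.
# Must run before the first CUDA allocation (or be set in the shell
# environment when launching the inference script).
import os

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"  # tune as needed
```

If the 80 GB A100 stays unavailable, the more common workaround on a 24 GB GPU is loading the LLM part in 8-bit so its weights take about half the fp16 footprint. The sketch below is not the SALMONN loading code; it assumes the Vicuna checkpoint can be loaded through Hugging Face transformers with bitsandbytes installed, and the path is a placeholder:

```python
# Sketch (not the SALMONN API): load a Vicuna-style LLM in 8-bit via
# bitsandbytes, which usually makes the difference between fitting and
# OOM on a 24 GB card.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

llm = AutoModelForCausalLM.from_pretrained(
    "/path/to/vicuna-13b",  # hypothetical local path to the merged Vicuna weights
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
    device_map="auto",
)
```

Whether this can be wired into the repo's checkpoint loading without further changes is an open question for the maintainers; the audio encoders and Q-Former still add to the footprint on top of the quantized LLM.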