Vaibhavs10 / insanely-fast-whisper

CUDA Out of Memory #162

Open · hatimkh20 opened this issue 8 months ago

hatimkh20 commented 8 months ago

CUDA out of memory. Tried to allocate 60.00 MiB. GPU 0 has a total capacty of 11.00 GiB of which 4.48 GiB is free. Of the allocated memory 5.19 GiB is allocated by PyTorch, and 283.84 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I am getting this error even though, according to the message, only 60 MiB is being allocated while 4.48 GiB is free, so why is it causing an issue?

Also, I am using batch size = 16 and chunk length = 30.

Any help would be appreciated.
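
For reference, a minimal sketch of the two mitigations hinted at here: lowering the batch size (the `--batch-size` flag from the project README) and setting the allocator option named in the error message. The file name and the 128 MiB split size are placeholders, not recommendations:

```bash
# Lower the batch size so per-batch activations fit in the ~4.5 GiB that is
# actually free on the 11 GiB card (other processes hold the rest).
insanely-fast-whisper --file-name audio.mp3 --batch-size 4

# Optionally follow the allocator hint from the error message to reduce
# fragmentation; 128 MiB is only an illustrative starting value.
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 \
  insanely-fast-whisper --file-name audio.mp3 --batch-size 4
```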

PrashantDixit0 commented 8 months ago

I was also facing the same issue, but then I changed the command from `insanely-fast-whisper --file-name <filename or URL>` to `insanely-fast-whisper --model-name distil-whisper/large-v2 --file-name <filename or URL>`.

It took some time to transcribe, but it worked :+1:

Utopiah commented 8 months ago

Thanks, what's the trade-off using --model-name distil-whisper/large-v2 vs the default model?

PrashantDixit0 commented 8 months ago

The default model is Whisper large-v3, which has 1550M parameters, while distil-whisper/large-v2 has 756M, which is why distil-whisper/large-v2 uses comparatively less GPU VRAM.

I hope I answered your query :+1:
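
If memory is still tight, the smaller model and a reduced batch size can be combined. A sketch, assuming the same CLI flags as above and a placeholder file name:

```bash
# distil-whisper/large-v2 roughly halves the weight footprint versus large-v3
# (756M vs 1550M parameters); a smaller batch cuts activation memory further.
insanely-fast-whisper --model-name distil-whisper/large-v2 \
  --file-name audio.mp3 --batch-size 8
```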

Utopiah commented 8 months ago

Thanks, so it uses less GPU VRAM (which allows it to run on long audio segments even on smaller cards), but it's also less accurate due to the lower number of parameters?

PrashantDixit0 commented 8 months ago

Theoretically, Whisper large-v3 should work better than v2, and generally it does, but one person on the OpenAI forum reported that whisper-large-v2 worked better than v3 across multiple runs:

https://community.openai.com/t/whisper-large-v3-model-vs-large-v2-model/535279

iSuslov commented 7 months ago

Having the same issue when using `--timestamp word`.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB. GPU 0 has a total capacity of 21.96 GiB of which 896.00 KiB is free. Including non-PyTorch memory, this process has 21.95 GiB memory in use. Of the allocated memory 21.14 GiB is allocated by PyTorch, and 591.83 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

This happens for batch sizes > 4. Tried on both L4 and T4 GPUs.
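
For the word-timestamp case, the workaround consistent with the report above is simply a smaller batch, optionally with the allocator setting named in this error message. A sketch with a placeholder file name:

```bash
# Word-level timestamps keep extra per-token state on the GPU; batch sizes
# above 4 were reported to OOM here, so 4 is used as the upper bound.
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True \
  insanely-fast-whisper --file-name audio.mp3 --timestamp word --batch-size 4
```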