Carleslc / AudioToText

Transcribe and translate audio to text using Whisper and DeepL.
https://carleslc.me/AudioToText

OutOfMemoryError: CUDA out of memory #1

Closed: Carleslc closed this issue 1 year ago

Carleslc commented 1 year ago

If you encounter this error in Google Colab, it means that your GPU has run out of memory.

This usually happens with free accounts when using large models or long audio files.

Try the following solutions:

mrazzari commented 1 year ago

When trying to transcribe multiple files, I also get an out-of-memory error on the second run (using the large model), as if the process isn't releasing memory when done.

Carleslc commented 1 year ago

> When trying to transcribe multiple files, I also get out of memory when doing the second run (using the large model). As if the process isn't releasing memory when done.

Yes, using the open-source large model in a free Colab account can fill the RAM in a single transcription if the audio file is long (usually around 30 minutes, but it depends on the file). If that happens, you need to restart the runtime to clear the RAM before each file. Another option is to use the API by filling in the api_key parameter, which will process the audio files on OpenAI's servers (faster and without filling your RAM, but with a pricing cost).
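The API alternative mentioned above can be sketched roughly like this (assuming the `openai` Python package and its v1 client; the file path and model name are illustrative, not taken from this project's code):

```python
def transcribe_via_api(audio_path: str, api_key: str) -> str:
    """Send an audio file to OpenAI's hosted Whisper instead of
    running the model locally, so no local GPU memory is used."""
    # Imported inside the function so the sketch only requires the
    # `openai` package when actually called.
    from openai import OpenAI

    client = OpenAI(api_key=api_key)
    with open(audio_path, "rb") as audio:
        result = client.audio.transcriptions.create(
            model="whisper-1",  # hosted Whisper model name
            file=audio,
        )
    return result.text
```

Note that the hosted API has its own limits (e.g. maximum upload size), so very long files may still need to be split.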

Maybe it would be possible to release memory via code, but for the moment it seems the whisper model does some data caching that fills the RAM and is not released after the transcription.
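A minimal sketch of what releasing memory between files could look like, assuming the `openai-whisper` and PyTorch packages (this is untested against the caching behaviour described above, so it may not fully reclaim the RAM; the model name and loop are illustrative):

```python
import gc


def transcribe_files(paths, model_name="large"):
    """Transcribe several files, trying to free GPU memory after each one."""
    # Imported inside the function so the sketch only requires these
    # packages when actually called.
    import torch
    import whisper

    results = {}
    for path in paths:
        # Load a fresh model per file so its memory can be reclaimed.
        model = whisper.load_model(model_name)
        results[path] = model.transcribe(path)["text"]
        # Drop all references to the model, run the garbage collector,
        # then ask PyTorch to return cached CUDA memory to the driver.
        del model
        gc.collect()
        torch.cuda.empty_cache()
    return results
```

If `empty_cache()` is not enough, restarting the Colab runtime between files remains the reliable workaround.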