jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Is there a way to specify the batch size to reduce VRAM? #398

Closed · dimitrios-git closed this issue 2 months ago

dimitrios-git commented 2 months ago

Running large models on large files requires a lot of memory, but other libraries like whisper and WhisperX let you pass a batch_size to the CLI. Is there an equivalent option for stable-ts?

jianfch commented 2 months ago

Batch transcription is not supported by the original Whisper models (i.e. the batch size is always 1), so there is no batch size parameter to adjust for reducing memory usage.

However, you can reduce memory usage with the Hugging Face models by specifying batch_size, because the default, batch_size=24, uses significantly more memory than the original models:
https://github.com/jianfch/stable-ts/blob/6d066308ed5a3328a69006d3a7d4496315736c0f/stable_whisper/whisper_word_level/hf_whisper.py#L186
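For illustration, here is a minimal sketch of lowering batch_size through the Python API. The model name and audio path are placeholders, and it assumes transcribe() forwards batch_size to the underlying pipeline, as the linked hf_whisper.py suggests:

```python
import stable_whisper

# Load the Hugging Face implementation of Whisper via stable-ts.
model = stable_whisper.load_hf_whisper('large-v3')

# Lower batch_size from the default of 24 to trade throughput for VRAM.
# 'audio.mp3' is a placeholder path.
result = model.transcribe('audio.mp3', batch_size=4)
result.to_srt_vtt('audio.srt')
```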

The best way to reduce memory usage is to use a distilled and/or quantized large model from Faster-Whisper or Hugging Face.
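A rough sketch of that approach with Faster-Whisper (the checkpoint name 'distil-large-v3' and compute_type='int8' are Faster-Whisper options chosen as examples; keyword arguments are assumed to be forwarded to faster_whisper.WhisperModel):

```python
import stable_whisper

# Load a distilled checkpoint with int8 quantization through Faster-Whisper.
model = stable_whisper.load_faster_whisper(
    'distil-large-v3',
    compute_type='int8',
)

# Depending on the stable-ts version, the stable-ts transcription method
# may be model.transcribe() or model.transcribe_stable().
result = model.transcribe('audio.mp3')
result.to_srt_vtt('audio.srt')
```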