chidiwilliams / buzz

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
https://chidiwilliams.github.io/buzz
MIT License
12.11k stars 911 forks

Faster Whisper does not use GPU #910

Open xpk20040228 opened 4 days ago

xpk20040228 commented 4 days ago

Using a 4060 8G with CUDA 12.6.65 on Windows. PyTorch was installed with `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124`. The PowerShell log says:

> [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.

and there is no load on my 4060. Oddly, regular Whisper can use the GPU with all model sizes and does not show the warning above.
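To narrow down where the GPU goes missing, here is a small diagnostic sketch (the function name is mine, not part of Buzz). Note that Faster Whisper runs inference through CTranslate2 rather than PyTorch, so a healthy PyTorch CUDA setup does not guarantee CTranslate2 can see the GPU or use float16:

```python
def cuda_status():
    """Report what the PyTorch and CTranslate2 backends can each see."""
    lines = []
    try:
        import torch
        lines.append(f"torch {torch.__version__}, cuda available: {torch.cuda.is_available()}")
    except ImportError:
        lines.append("torch not installed")
    try:
        import ctranslate2
        # CTranslate2 reports which compute types its CUDA backend supports;
        # if "float16" is missing, Faster Whisper falls back to float32.
        lines.append(f"ctranslate2 cuda compute types: {ctranslate2.get_supported_compute_types('cuda')}")
    except Exception as exc:  # ImportError, or no usable CUDA device
        lines.append(f"ctranslate2 check failed: {exc}")
    return "\n".join(lines)

print(cuda_status())
```

If the torch line shows CUDA as available but the CTranslate2 line fails or lacks `float16`, the problem is on the CTranslate2 side rather than the PyTorch install.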

raivisdejus commented 3 days ago

@xpk20040228 See if a slightly older version of torch works:

```shell
pip3 uninstall torch torchaudio
pip3 install torch==2.2.1+cu121 torchaudio==2.2.1+cu121 --index-url https://download.pytorch.org/whl/cu121
```

This assumes you currently have torch 2.2.4 installed.
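After reinstalling, it is worth confirming which torch build is actually active and whether it reports CUDA. A quick check (it prints a note instead of failing if torch is absent):

```shell
python3 - <<'PY'
try:
    import torch
    print(torch.__version__, "cuda:", torch.cuda.is_available())
except ImportError:
    print("torch not installed")
PY
```

If this prints a `+cpu` build or `cuda: False`, pip picked up a CPU-only wheel and the index URL above needs to be used again.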

raivisdejus commented 2 days ago

@xpk20040228 It seems current versions of Faster Whisper can't use the GPU on Windows, as cuBLAS (https://developer.nvidia.com/cublas) is not available for Windows with CUDA 12 at this time.

The somewhat good news is that on GPU there is not a big difference in speed between Faster Whisper, regular Whisper, and Hugging Face (Transformers) Whisper.

Thanks for pointing this issue out. Sadly there is not much we can do at this time besides waiting for Nvidia to release cuBLAS for Windows with CUDA 12.
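As a workaround for the warning itself (not for the missing GPU support), faster-whisper lets callers request a compute type explicitly. A sketch, assuming the `faster-whisper` package is installed; the function name and CPU fallback are mine:

```python
def load_float32_model(size="small"):
    """Load a Faster Whisper model with an explicit float32 compute type.

    Requesting float32 up front avoids the automatic float16 -> float32
    conversion warning; it does not restore GPU support where cuBLAS for
    CUDA 12 on Windows is unavailable.
    """
    try:
        from faster_whisper import WhisperModel
    except ImportError:
        return None  # faster-whisper not installed in this environment
    try:
        return WhisperModel(size, device="cuda", compute_type="float32")
    except Exception:
        # e.g. no usable CUDA device; fall back to CPU inference
        return WhisperModel(size, device="cpu", compute_type="float32")
```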