Vaibhavs10 / insanely-fast-whisper

Apache License 2.0
7.76k stars · 547 forks

Slower than faster-whisper: is something wrong in my config? #202

Open Green-li opened 8 months ago

Green-li commented 8 months ago

The model used is Belle-2/Belle-whisper-large-v3-zh, a fine-tune of openai/whisper-large-v3; both are whisper-large-v3 architectures, just with different weights. When I use faster-whisper (fp16, bs=1) to transcribe a 220.46 s audio file, it takes 10.43 s. When I use this CLI to transcribe the same audio, it takes 15 s. The CLI invocation:

CUDA_VISIBLE_DEVICES=0 pdm run insanely-fast-whisper --model-name Belle-2/Belle-whisper-large-v3-zh --file-name test.wav --flash True --batch-size 32 --device-id 0

The output: [screenshot attached]. And when I run the CLI, VRAM usage is only 7 GB and GPU utilization is only 60%. Why?
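For context, the insanely-fast-whisper CLI is essentially a wrapper around the transformers ASR pipeline. A minimal sketch of a roughly equivalent call follows; the model name and parameters are taken from the command above, but the exact pipeline arguments (chunking, flash-attention flag) are assumptions based on recent transformers releases, not the CLI's verified internals. A small real-time-factor helper makes the two timings comparable:

```python
import time


def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio transcribed per second of wall-clock time."""
    return audio_seconds / wall_seconds


# The numbers reported above: 220.46 s of audio in 10.43 s (faster-whisper)
# vs. 15 s (this CLI), i.e. roughly 21.1x vs. 14.7x real time.

if __name__ == "__main__":
    # Heavy GPU path; kept under __main__ so the helper above stays importable.
    # Assumes transformers, torch, and flash-attn are installed.
    import torch
    from transformers import pipeline

    pipe = pipeline(
        "automatic-speech-recognition",
        model="Belle-2/Belle-whisper-large-v3-zh",
        torch_dtype=torch.float16,
        device="cuda:0",
        model_kwargs={"attn_implementation": "flash_attention_2"},
    )
    start = time.perf_counter()
    out = pipe("test.wav", chunk_length_s=30, batch_size=32, return_timestamps=True)
    elapsed = time.perf_counter() - start
    print(out["text"])
    print(f"RTF: {realtime_factor(220.46, elapsed):.1f}x")
```

Low GPU utilization with a large batch size often means the run is bottlenecked elsewhere (audio decoding, chunking, or short-audio batches that never fill), which would be consistent with the 60% utilization reported.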

silvacarl2 commented 7 months ago

I have seen the exact same thing. This technique appears not to be as fast as the faster-whisper implementation; in fact, it seems to be about two times slower.

So I am not sure what I am missing?

we are testing it with this gist: https://gist.github.com/Vaibhavs10/16087d3c4dea59bdcba07ffbeee91272
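For anyone reproducing the comparison, the faster-whisper side of the benchmark can be sketched like this. It is a minimal sketch only: the beam size is illustrative, the fine-tuned model would first need a CTranslate2 conversion to load in faster-whisper, and the slowdown helper simply formalizes the ratio quoted in this thread:

```python
import time


def slowdown(candidate_seconds: float, baseline_seconds: float) -> float:
    """How many times slower the candidate run is than the baseline."""
    return candidate_seconds / baseline_seconds


# Numbers from the original comment: 15 s (CLI) vs. 10.43 s (faster-whisper)
# is roughly a 1.4x slowdown; hardware, audio, and settings will shift this.

if __name__ == "__main__":
    # Heavy GPU path, kept under __main__. Assumes faster-whisper is installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        "Belle-2/Belle-whisper-large-v3-zh",  # needs a CTranslate2 conversion
        device="cuda",
        compute_type="float16",
    )
    start = time.perf_counter()
    segments, info = model.transcribe("test.wav", beam_size=5)
    text = "".join(s.text for s in segments)  # segments are generated lazily
    elapsed = time.perf_counter() - start
    print(text)
    print(f"elapsed: {elapsed:.2f} s")
```

Note that faster-whisper's `transcribe` returns a lazy generator, so timing must consume the segments (as above) or the decode never actually runs.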