I am testing the same audio file with the vanilla pipeline mentioned here (whisper-large-v3-turbo) and with the ggml model in whisper.cpp, and the two give very different results.
The original whisper-large-v3-turbo pipeline is much more accurate.
Is this because of the chunked mode vs. sequential mode mentioned here?
Is there any way to change it in whisper.cpp?
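
For reference, this is roughly how I understand the two modes are selected on the Hugging Face side. It is only a sketch: I am assuming that passing `chunk_length_s` to the `transformers` pipeline enables chunked mode and that omitting it leaves the default sequential mode. The helper names `pipeline_kwargs` and `build_asr`, the batch size, and the file name `audio.wav` are placeholders of mine, not anything from the model card or whisper.cpp.

```python
MODEL_ID = "openai/whisper-large-v3-turbo"


def pipeline_kwargs(mode: str) -> dict:
    """Return the extra pipeline arguments for the given long-form mode."""
    if mode == "chunked":
        # Assumption: chunk_length_s switches the pipeline to chunked decoding.
        return {"chunk_length_s": 30, "batch_size": 8}
    if mode == "sequential":
        # Assumption: with no chunk_length_s, the sequential algorithm is used.
        return {}
    raise ValueError(f"unknown mode: {mode!r}")


def build_asr(mode: str):
    """Build the ASR pipeline; transformers is imported only when needed."""
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        **pipeline_kwargs(mode),
    )


if __name__ == "__main__":
    # Downloads the model on first run; audio.wav is a placeholder path.
    asr = build_asr("sequential")
    # return_timestamps=True is, as far as I know, needed for audio over 30 s.
    print(asr("audio.wav", return_timestamps=True)["text"])
```

Running the same file once with `"chunked"` and once with `"sequential"` should show whether the mode alone explains the gap I see against whisper.cpp.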