Closed by terryops 3 months ago
I've been getting the same error with different languages (including English, actually). Any help is much appreciated.
same error :(
I got the same error when using this: https://github.com/ggerganov/whisper.cpp. Do you get the same results using the official Whisper from OpenAI? I decided to revert to that because the results I got from anything else were too inaccurate.
When I've played with the Whisper large model, different languages work as expected. When I use WhisperX or other speed-oriented implementations, I have issues with languages other than English.
I tested this project with English audio (default model) and it worked as expected, but when I ran the same audio with the large model, I got this error:
RuntimeError: Calculated padded input size per channel: (1). Kernel size: (2). Kernel size can't be greater than actual input size
When I switched to a different audio file in Chinese (using large-V2), it ran without error, but the output contained many odd word repetitions. Translating the Chinese below, it's akin to: Speaker 1: Please don't don't hesitate to like ke and subscribe scribe
Update: Tested on Japanese and got the same result. Tested on French and it works well, just like English.
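For anyone trying to read the RuntimeError quoted above: the message is the one PyTorch raises when a convolution receives an input that, after padding, is shorter than its kernel. Here the input had length 1 and the kernel had size 2. The sketch below reproduces only that size arithmetic in plain Python. The function names are made up for illustration, and it assumes the failing layer is a 1-D convolution with no padding, which the thread does not confirm. It does not show where in the pipeline the too-short input comes from.

```python
def effective_kernel_size(kernel_size: int, dilation: int = 1) -> int:
    """Span of input positions covered by one kernel application."""
    return dilation * (kernel_size - 1) + 1


def conv1d_output_length(input_len: int, kernel_size: int,
                         padding: int = 0, stride: int = 1,
                         dilation: int = 1) -> int:
    """Output length of a 1-D convolution (hypothetical helper).

    Raises ValueError when the padded input is shorter than the kernel,
    which is the condition the RuntimeError in this thread describes.
    """
    padded = input_len + 2 * padding
    effective = effective_kernel_size(kernel_size, dilation)
    if effective > padded:
        raise ValueError(
            f"padded input size ({padded}) is smaller than "
            f"kernel size ({effective})"
        )
    return (padded - effective) // stride + 1


if __name__ == "__main__":
    # A length-2 input is the shortest a size-2 kernel can handle without padding.
    print(conv1d_output_length(input_len=2, kernel_size=2))
    try:
        # Same sizes as in the error message: input (1), kernel (2).
        conv1d_output_length(input_len=1, kernel_size=2)
    except ValueError as exc:
        print("would fail:", exc)
```

If this reading is right, the thing to look for is a segment that ends up with only a single frame by the time it reaches that layer, for example a very short audio segment.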