Vaibhavs10 / insanely-fast-whisper

Apache License 2.0
7.65k stars 537 forks

please correct and/or update the readme comparing other whisper implementations #82

Closed BBC-Esq closed 11 months ago

BBC-Esq commented 11 months ago

Just FYI, I've messaged the folks over at faster-whisper, and I will do the same for whisper.cpp and the Jax implementation. I'm contemplating which approach to use in my code, so...

Along those lines, I'm putting in this "issue" to ask for revisions to the readme regarding comparing other whisper implementations. I'm hoping for a more apples-to-apples comparison.

For purposes of this issue, when I say "insanely-fast-whisper test" I'm referring to the test stating that it took 5 min 2 sec. And when I refer to "faster-whisper test" I'm referring to the test stating it took 9 min 23 sec. These two seem the most comparable.

Here are my requests regarding this issue:

Overall, I'm impressed and may move to using it... However, in my own test on an RTX 4090, I ran insanely-fast-whisper with bettertransformer at a batch size of 1 (using fp16), and the Sam Altman audio took approximately 10 minutes. I also tested faster-whisper using a beam size of 5 (since I still have to research how to pass a beam size of 1), and it took almost exactly the same time. I fail to see how you're getting roughly half the time with insanely-fast-whisper... I haven't had a chance to test the Flash Attention 2 variety yet...
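A minimal sketch of what a like-for-like timing run could look like, using only the Python standard library. The names `time_runs` and `compare` are hypothetical, and the two candidates below are stand-in functions rather than real transcription calls; nothing here reflects how the readme numbers were actually produced.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(fn: Callable[[], object], repeats: int = 3, warmup: int = 1) -> List[float]:
    """Return wall-clock durations in seconds for `repeats` calls of `fn`."""
    for _ in range(warmup):
        fn()  # untimed warm-up call so one-time setup cost is not measured
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def compare(candidates: Dict[str, Callable[[], object]], repeats: int = 3) -> Dict[str, float]:
    """Median seconds per candidate, all measured under the same protocol."""
    return {
        name: statistics.median(time_runs(fn, repeats=repeats))
        for name, fn in candidates.items()
    }


if __name__ == "__main__":
    # Stand-ins (assumption): replace each lambda with one full transcription
    # of the same audio file, using matched precision, batch size and beam size.
    results = compare({
        "implementation_a": lambda: time.sleep(0.02),
        "implementation_b": lambda: time.sleep(0.04),
    })
    for name, seconds in results.items():
        print(f"{name}: {seconds:.3f} s (median)")
```

For a real comparison, each callable would wrap a single end-to-end transcription with the settings held equal on both sides. In faster-whisper, for instance, `transcribe` takes a `beam_size` argument and yields segments lazily, so the wrapped callable should consume the whole result for the timing to cover the full decode.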

It's important to me, and I'm assuming others, before spending hours upon hours revising code, to have true comparisons. Hope my suggestion comes across alright. I'm truly interested in this technology and love the work that everyone is doing...Thanks!

Purfview commented 4 months ago

From the benchmarks on the main page it's obvious where it's faster, why it's faster and where it wouldn't be faster.

Blame yourself if you fall victim to your imagination.

talipturkmen commented 4 months ago

Hahaha chill man! Why is everybody so aggressive in this thread? 🤣 I just pointed out that OP's claims are right.