savbell / whisper-writer

💬📝 A small dictation app using OpenAI's Whisper speech recognition model.
GNU General Public License v3.0

Latency/Using another backend #11

Closed (glinkot closed this 6 months ago)

glinkot commented 11 months ago

As mentioned in another thread, the UI of this tool is great for its purpose, but the latency is significant. I ran these tests on my RTX 4090 laptop (a Legion 7i Pro from this year):

small.en:
- 1 word: 4.98s (0.2 words/sec)
- 7-word sentence: 7.43s (0.94 words/sec)
- 53 words, 2 sentences: 21.3s (2.49 words/sec)

large-v2:
- 7-word sentence: 16.84s (0.42 words/sec)
- 52 words, 2 sentences: 31.2s (1.67 words/sec)

If we could instead call one of the C++-based ports (https://github.com/Const-me/Whisper or https://github.com/guillaumekln/faster-whisper), the latency could be reduced significantly. I tested the first of those by recording the same speech to a file and transcribing it to a text file:
- 1 word: 0.687s
- 5 words: 0.844s
- 53 words: 1.485s (on the second run; the first run took 4.8s, presumably to warm something up)
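For anyone who wants to reproduce numbers like these, here is a minimal timing sketch. It is not the script used for the figures above. `time_runs` and `words_per_second` are hypothetical helper names, and `transcribe` stands in for whichever backend is being measured. Timing several runs on the same file keeps the first (warm-up) run separate from the later ones.

```python
import time
from typing import Callable, List


def time_runs(transcribe: Callable[[str], str], audio_path: str, runs: int = 3) -> List[float]:
    """Return the wall-clock seconds taken by each of `runs` calls on the same file.

    `transcribe` is any callable that takes an audio path and returns text
    (an assumption for this sketch; wrap your backend of choice to match).
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        transcribe(audio_path)
        timings.append(time.perf_counter() - start)
    return timings


def words_per_second(word_count: int, seconds: float) -> float:
    """Throughput figure as quoted in the comment above: words divided by elapsed seconds."""
    return word_count / seconds
```

The first entry of the returned list includes any model loading or warm-up cost, so it is worth reporting it separately from the rest. As a sanity check, `words_per_second(53, 21.3)` rounds to the 2.49 words/sec quoted for small.en.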

savbell commented 6 months ago

Hi there,

Thanks for your comment! I've recently returned to this project and just migrated over to use faster-whisper. This should speed things up significantly! :)
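For reference, a minimal sketch of calling faster-whisper directly from Python. This is not necessarily how whisper-writer invokes it. `load_model` and `transcribe_file` are hypothetical helper names, and the model size, device, and compute type are assumptions you would adjust for your own hardware.

```python
def load_model(model_size: str = "small.en", device: str = "cuda", compute_type: str = "float16"):
    """Load a faster-whisper model (assumes `pip install faster-whisper`).

    The import is kept inside the function so this module can be imported
    even where the package is not installed.
    """
    from faster_whisper import WhisperModel

    return WhisperModel(model_size, device=device, compute_type=compute_type)


def transcribe_file(model, audio_path: str) -> str:
    """Transcribe one audio file and return the text as a single string."""
    # transcribe() returns a lazy generator of segments plus an info object;
    # the actual decoding happens while the segments are iterated.
    segments, _info = model.transcribe(audio_path, beam_size=5)
    return " ".join(segment.text.strip() for segment in segments)
```

Typical usage would be `model = load_model()` once at startup, then `transcribe_file(model, "recording.wav")` per recording (the file name is a placeholder). Loading the model once avoids paying the startup cost on every dictation.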