As mentioned in another thread, the UI of this tool is great for its purpose, but the latency is significant. Tests done on my RTX 4090 laptop (this year's Legion 7i Pro):

small.en
1 word: 4.98s (0.20 words/s)
7-word sentence: 7.43s (0.94 words/s)
53 words, 2 sentences: 21.3s (2.49 words/s)

large-v2
7-word sentence: 16.84s (0.42 words/s)
52 words, 2 sentences: 31.2s (1.67 words/s)

If we could instead call one of the C++-based ports (https://github.com/Const-me/Whisper or https://github.com/guillaumekln/faster-whisper), this latency could be reduced significantly. I tested the first of those by recording the same speech to a file and transcribing it to a text file:

1 word: 0.687s
5 words: 0.844s
53 words: 1.485s on the second run (the first run took 4.8s, presumably warming something up)

Thanks for your comment! I've recently returned to this project and just migrated over to faster-whisper. This should speed things up significantly! :)
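For anyone wanting to reproduce these numbers against the new backend, here is a minimal timing sketch assuming the faster-whisper Python package (`pip install faster-whisper`); the model name, audio file name, and helper function names are illustrative, not part of this project:

```python
import time

# Assumption: the faster-whisper package is installed.
# from faster_whisper import WhisperModel


def transcribe_timed(model, audio_path):
    """Run one transcription and return (text, elapsed seconds)."""
    start = time.perf_counter()
    segments, _info = model.transcribe(audio_path)
    # `segments` is a lazy generator; joining it forces the full decode,
    # so the timer covers the whole transcription, not just setup.
    text = " ".join(seg.text.strip() for seg in segments)
    return text, time.perf_counter() - start


def words_per_second(text, elapsed):
    """Throughput metric used in the benchmarks above."""
    return len(text.split()) / elapsed if elapsed > 0 else 0.0


# Usage sketch (GPU build, assuming CUDA is available):
# model = WhisperModel("small.en", device="cuda", compute_type="float16")
# text, elapsed = transcribe_timed(model, "speech.wav")  # run twice: first call warms up
# print(f"{elapsed:.2f}s ({words_per_second(text, elapsed):.2f} words/s)")
```

Running the benchmark twice and keeping the second measurement matters here, since the first run includes one-time model loading, as the warm-up effect above shows.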