Latency/Using another backend

As mentioned in another thread, the UI of this tool is great for its purpose, but the latency is significant. Here are timings from my RTX 4090 laptop (Legion 7i Pro from this year):

small.en
1 word: 4.98s (0.2 words/sec)
7-word sentence: 7.43s (0.94 words/sec)
53 words, 2 sentences: 21.3s (2.49 words/sec)

large-v2
7-word sentence: 16.84s (0.42 words/sec)
52 words, 2 sentences: 31.2s (1.67 words/sec)

If we could instead call one of the C++-based ports (https://github.com/Const-me/Whisper or https://github.com/guillaumekln/faster-whisper), the latency could be significantly reduced. I tested the first of those by recording the same speech to a file and transcribing it to a text file:

1 word: 0.687s
5 words: 0.844s
53 words: 1.485s on the second run (the first run took 4.8s, presumably to warm something up)
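For anyone who wants to repeat the comparison with the second option, here is a rough sketch of a timing harness in Python. It is not code from this tool: `time_backend` and `make_faster_whisper_backend` are hypothetical names, and the faster-whisper settings (`device="cuda"`, `compute_type="float16"`) are assumptions that fit a recent NVIDIA GPU. The harness records every run separately so a slow first (warm-up) run stays visible instead of being averaged away.

```python
import time
from typing import Callable, Dict, List


def time_backend(transcribe: Callable[[str], str], audio_path: str,
                 runs: int = 3) -> List[Dict[str, float]]:
    """Call transcribe(audio_path) `runs` times and record per-run latency."""
    results = []
    for run in range(runs):
        start = time.perf_counter()
        text = transcribe(audio_path)
        elapsed = time.perf_counter() - start
        words = len(text.split())
        results.append({
            "run": run + 1,
            "seconds": elapsed,
            "words": words,
            # Guard against a zero-length interval on very fast stubs.
            "words_per_sec": words / elapsed if elapsed > 0 else 0.0,
        })
    return results


def make_faster_whisper_backend(model_name: str = "small.en") -> Callable[[str], str]:
    """Build a transcribe(audio_path) callable backed by faster-whisper.

    Assumes `pip install faster-whisper` and a CUDA-capable GPU.
    """
    # Imported lazily so the harness itself runs without faster-whisper installed.
    from faster_whisper import WhisperModel

    # Model loading happens once here, outside the timed region.
    model = WhisperModel(model_name, device="cuda", compute_type="float16")

    def transcribe(audio_path: str) -> str:
        segments, _info = model.transcribe(audio_path)
        # `segments` is a lazy generator; joining it forces the actual decoding.
        return " ".join(segment.text.strip() for segment in segments)

    return transcribe
```

Calling `time_backend(make_faster_whisper_backend("small.en"), "recording.wav")` (the file name is a placeholder) returns one entry per run, so the first-run warm-up cost and the steady-state latency can be read off separately. Any other backend can be compared the same way by wrapping it in a function that takes a file path and returns the transcript.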

savbell / whisper-writer, issue #11