natrys / whisper.el

Speech-to-Text interface for Emacs using OpenAI's whisper model and whisper.cpp as inference engine.

fix: whisper.cpp allows for language detection #1

Closed by munen 1 year ago

munen commented 1 year ago

Here's the upstream PR: https://github.com/ggerganov/whisper.cpp/pull/286

I've tested this successfully in my Emacs installation - primarily with German and English.

Thank you for providing whisper.el, it's very fun to use :pray:

natrys commented 1 year ago

This is really cool, thanks a lot. Simplified the condition a bit, and added a line in the readme:

https://github.com/natrys/whisper.el/commit/fd9fd202662c339c6a03282f19a7846094b11d09

munen commented 1 year ago

> Simplified the condition a bit,

Totally reasonable change, my apologies for the redundant boolean algebra(;

> and added a line in the readme:

Thanks for merging everything quickly! That made it possible to onboard the next user of whisper.el, already^^

natrys commented 1 year ago

I feel like there is a strong argument for `auto` to now just be the default. It works for English without issue, even for sub-30s clips, so it has strictly more utility than the current default, `en`.

The only downside seems to be a bit of a performance penalty, which is probably a non-issue if the audio is longer than, say, a minute. But if most people are using it with their ChatGPT plugin, I think in that context they are probably kinda latency-sensitive and we shouldn't worsen it by default, idk.
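
For readers weighing the same trade-off, here is a minimal configuration sketch of opting in to detection per user rather than by default. It assumes that whisper.el's `whisper-language` variable accepts `"auto"` after this change, and that a multilingual model is needed, since English-only models (names ending in `.en`) cannot detect other languages. The model name `"base"` is only an illustrative choice.

```elisp
;; Sketch only: opt in to language detection in your own init file.
;; Assumption: `whisper-language' accepts "auto" after this change.
;; Assumption: a multilingual model is required; "base" is an example,
;; while an English-only model such as "base.en" would not detect languages.
(setq whisper-model "base"
      whisper-language "auto")

;; If latency matters more than detection (e.g. short dictation clips),
;; pinning the language avoids the extra detection step:
;; (setq whisper-language "en")
```

Leaving `en` as the default and letting users switch to `auto` this way keeps the short-clip latency unchanged for anyone who does not need detection.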