ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

whisper.android: change language and model on-the-fly #2595

Open amalgame21 opened 2 days ago

amalgame21 commented 2 days ago

I read https://github.com/ggerganov/whisper.cpp/issues/1099, which explains how to configure the language at build time: https://github.com/ggerganov/whisper.cpp/blob/021eef1000b0a84cc08575aac3352116c72e8187/examples/whisper.android/lib/src/main/jni/whisper/jni.c#L179

I am using the large-v3 model because it is the only one that handles my preferred language properly. However, I want to toggle between "en", "" (auto), and my preferred language on the fly, and I also want to load a different model on the fly (for English, the base model is much faster), so that I no longer need to compile multiple APKs for different purposes.

How can I do that? Thanks!
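A minimal sketch of the language half of this request, assuming the language code is passed from the app into the native call as a string instead of being hardcoded at the linked line. `demo_params`, `demo_set_language`, and the list of allowed codes are hypothetical names made up for illustration; `demo_params` only stands in for the `language` and `translate` fields of `whisper_full_params`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two fields of whisper_full_params that
 * matter here. Real code would set them on the struct returned by
 * whisper_full_default_params(). */
struct demo_params {
    const char *language;  /* e.g. "en", "yue", or "auto" */
    int         translate; /* 0 = transcribe, 1 = translate to English */
};

/* Languages the app offers in its UI (an assumption for this sketch). */
static const char *const k_allowed[] = { "auto", "en", "ja", "zh", "yue" };

/* Returns 1 and updates p if lang is accepted, 0 otherwise.
 * NULL or "" is treated as a request for auto-detection. */
static int demo_set_language(struct demo_params *p, const char *lang) {
    assert(p != NULL);
    if (lang == NULL || lang[0] == '\0') {
        p->language = "auto";
        return 1;
    }
    for (size_t i = 0; i < sizeof(k_allowed) / sizeof(k_allowed[0]); ++i) {
        if (strcmp(lang, k_allowed[i]) == 0) {
            /* Store the static copy, not the caller's buffer, so the
             * pointer stays valid after a JNI string is released. */
            p->language = k_allowed[i];
            return 1;
        }
    }
    return 0; /* unknown code: keep the previous language */
}
```

Storing a pointer to a static string rather than to the caller's buffer is a deliberate choice here: a string obtained through JNI has to be released again, so a pointer into it should not be kept for the duration of a transcription.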

mrfragger commented 2 days ago

Split your audio into chapters or chunks, process the English ones and then the other-language ones, then combine all the subtitles back into one subtitle file.

amalgame21 commented 2 days ago

> Split your audio into chapters or chunks, process the English ones and then the other-language ones, then combine all the subtitles back into one subtitle file.

  1. I want to use it as a voice-to-text (VTT) input method. In this case I have to set the language to "yue" and use the large-v3 model, because it is the only model that supports "yue".
  2. I want to use it as a translation tool while traveling to places like Japan. In this case I have to set the language to "ja" or "en" when I am talking to Japanese people, and to "en", "zh", or "yue" when the Japanese people are talking. Here I might use a lightweight model to increase the speed.

So I would have to compile at least two different APKs to cover both use cases.
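The two use cases could in principle share one APK if each is treated as a runtime profile (model file plus language) and the model is reloaded only when the profile's model file changes. The sketch below illustrates that idea only; `demo_profile`, `demo_session`, `demo_switch`, the profile names, and the model file names are all hypothetical placeholders. It does not call whisper.cpp; the comment marks where a real build would presumably free the old context with `whisper_free()` and load the new model with `whisper_init_from_file_with_params()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry per use case. File names are placeholders, not real asset paths. */
struct demo_profile {
    const char *name;
    const char *model_file;
    const char *language;
};

static const struct demo_profile k_profiles[] = {
    { "dictation", "ggml-large-v3.bin", "yue" },
    { "travel-ja", "ggml-base.bin",     "ja"  },
    { "travel-en", "ggml-base.bin",     "en"  },
};

struct demo_session {
    const struct demo_profile *active; /* NULL until the first switch */
    int loads;                         /* model loads so far, for testing */
};

/* Linear search by profile name; returns NULL if the name is unknown. */
static const struct demo_profile *demo_find_profile(const char *name) {
    for (size_t i = 0; i < sizeof(k_profiles) / sizeof(k_profiles[0]); ++i) {
        if (strcmp(name, k_profiles[i].name) == 0) {
            return &k_profiles[i];
        }
    }
    return NULL;
}

/* Switch to the named profile. Returns 0 on success, -1 if unknown.
 * The (expensive) model load happens only when the model file differs. */
static int demo_switch(struct demo_session *s, const char *name) {
    assert(s != NULL && name != NULL);
    const struct demo_profile *next = demo_find_profile(name);
    if (next == NULL) {
        return -1; /* keep the current profile */
    }
    if (s->active == NULL ||
        strcmp(s->active->model_file, next->model_file) != 0) {
        /* Real code would free the old context and load next->model_file
         * here; this sketch only counts the reload. */
        s->loads++;
    }
    s->active = next; /* the language change itself needs no reload */
    return 0;
}
```

With this layout, switching between "travel-ja" and "travel-en" changes only the language, while switching to "dictation" triggers a single model reload.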