collabora / WhisperFusion

WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.

Other languages and Whisper models #20

Open fuglu opened 10 months ago

fuglu commented 10 months ago

Hi and thanks for sharing this awesome project! 🤩

Currently it seems that only English is supported/configured, but we would also like to try other languages (e.g. German).

So we started with Whisper. We briefly tried using the Whisper small model instead of small.en by simply patching build-whisper.sh and rebuilding the Docker container. That doesn't seem to be the only place we have to touch, though, as this is all we get when running the container:

INFO:root:[Whisper INFO:] New client connected

INFO:root:[Whisper INFO]: . br,pt whe int Mus............................................, eos: True
INFO:root:[Whisper INFO]: Average inference time 0.37747994336214935

INFO:root:[Whisper INFO]: .. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br. br., eos: True
INFO:root:[Whisper INFO]: Average inference time 0.31598156690597534

Before we dig deeper into the project (we just found it today), we thought we'd quickly ask if you might have any tips/recommendations for us or are already working on similar ideas.

Thanks again!

zoq commented 10 months ago

Hello, thanks for your interest in the project. For the transcription part, make sure to also pass the right language here:

https://github.com/collabora/WhisperFusion/blob/1de4c740954848883f911e6c97e1db105b999b82/examples/chatbot/html/js/main.js#L146

Use "de" for German.

Also, make sure to use Mistral, since phi-2 has limited support for German. Right now WhisperSpeech supports only Polish and English; we are working on a German version, so the output might sound a little strange.
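
As a rough illustration of the language setting mentioned above, here is a minimal sketch of a client-side config with the language switched to German. The function name `buildClientConfig` and the field names (`language`, `multilingual`, `task`) are assumptions made for illustration and are not copied from main.js; check the linked line for the keys the client actually sends.

```javascript
// Hypothetical sketch: buildClientConfig and the field names below are
// assumptions for illustration, not the actual contents of main.js.
function buildClientConfig(language) {
  return {
    // Language code for transcription, e.g. "de" for German.
    language: language,
    // Assumed flag: enable multilingual mode for anything other than English.
    multilingual: language !== "en",
    // Assumed task name: transcribe rather than translate.
    task: "transcribe",
  };
}

// Serialize the config the way a client might before sending it to the server.
const payload = JSON.stringify(buildClientConfig("de"));
console.log(payload);
```

Whatever the real keys are, the point is that the language has to be changed on the client side in addition to building the multilingual model in build-whisper.sh.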