Closed jojasadventure closed 2 months ago
Update: After further testing, I've discovered that the language selection issue appears to be model-dependent.
I found that the `Systran/faster-distil-whisper-large-v3` model consistently produces English transcriptions regardless of the language parameter. However, using the `Systran/faster-whisper-medium` model with `language=de`, the API correctly transcribes German audio. I haven't tried other models yet.
tl;dr: it seems the language selection functionality is implemented on the server side but may not be working as expected with all models.
Yeah, distil models only support English.
From the README.md
...
language:
- en
...
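A client could guard against this by falling back to a multilingual model whenever a non-English language is requested. The following is only a sketch: the helper name `pick_model`, the `ENGLISH_ONLY_MODELS` set, and the choice of `Systran/faster-whisper-medium` as the fallback are assumptions for illustration, not part of the server or of any library.

```python
# Hypothetical client-side helper; nothing here is provided by the server.
# The set below lists only the one model reported as English-only in this thread.
ENGLISH_ONLY_MODELS = {"Systran/faster-distil-whisper-large-v3"}

# Assumed fallback: the model that transcribed German correctly in this thread.
MULTILINGUAL_FALLBACK = "Systran/faster-whisper-medium"


def pick_model(preferred: str, language: str) -> str:
    """Return a model suited to `language`, swapping out English-only models."""
    if preferred in ENGLISH_ONLY_MODELS and language != "en":
        return MULTILINGUAL_FALLBACK
    return preferred
```

With this, `pick_model("Systran/faster-distil-whisper-large-v3", "de")` would select the medium model, while an English request keeps the distil model.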
Hi, first of all, thank you so much for building this. I managed to create a little Python app to trigger from my Mac with a hotkey and replace macOS dictation for myself [https://github.com/jojasadventure/whisper-client]
I'm having a hard time implementing a language switch, which would be great to have for transcribing in different languages. Whatever I try, though, I get back a sort-of-translation instead of a transcription.
I have noticed that passing a different language parameter does not seem to convince the transcribe endpoint to transcribe in that language. I tried to troubleshoot this by transcribing a file in the web UI, but it doesn't have a language selector.
I have attempted to fix this by also passing `task="transcribe"`, as that seems to be the suggested fix on many forums. The `create()` method does not support this parameter, though (maybe I'm confused).
I can even see that the server has some code in `transcribe_file.py` to deal with the task parameter, but I'm too much of a noob to figure out, or even convince Claude to figure out, how to pass it. Or does the code below even mean it is automatically set to transcribe? In an ideal world, of course, the transcription endpoint should never return translations, as there is a separate translation endpoint, so that would make sense ...
Would you be willing to point me in the right direction, or at least confirm if that's even implemented / something that should work in principle? Thank you!
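For reference, a minimal sketch of passing the language from a client, assuming the client uses the official `openai` Python package against the server's OpenAI-compatible endpoint. In that package, transcription and translation are separate resources (`audio.transcriptions` and `audio.translations`), which is why `create()` takes no `task` argument. The `base_url` value, the placeholder API key, and the helper names below are assumptions, not taken from the original post.

```python
from typing import Any, BinaryIO


def build_transcription_kwargs(audio_file: BinaryIO, model: str, language: str) -> dict[str, Any]:
    """Collect the arguments for audio.transcriptions.create().

    There is no `task` key: the transcriptions endpoint is the "transcribe" task.
    """
    return {"model": model, "file": audio_file, "language": language}


def transcribe(base_url: str, audio_path: str, model: str, language: str) -> str:
    """Send one file to an OpenAI-compatible server and return the text."""
    # Imported lazily so the helper above works without the package installed.
    from openai import OpenAI

    # Assumption: the local server ignores the API key, so a placeholder is used.
    client = OpenAI(base_url=base_url, api_key="not-needed")
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            **build_transcription_kwargs(audio_file, model, language)
        )
    return result.text
```

A call might look like `transcribe("http://localhost:8000/v1", "sample.wav", "Systran/faster-whisper-medium", "de")`, where the URL and file name are placeholders.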