mxpucci closed this issue 1 year ago
You have to change it on the Whisper instance:
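from whispercpp import Whisper

w = Whisper.from_pretrained("tiny")
w.params.language = "it"  # set the language on the instance's params
w.transcribe(arr)  # arr: the audio samples to transcribe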
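To illustrate why assigning on the instance can differ from assigning on the class (as in api.Params.language = 'it'), here is a minimal pure-Python sketch. The Params class below is a hypothetical stand-in, not the whispercpp binding, and it assumes the real object exposes language as an ordinary property; whether the binding behaves the same way is an assumption.

```python
class Params:
    """Hypothetical stand-in for a params object; NOT the whispercpp class."""

    def __init__(self):
        self._language = "en"  # state that the underlying engine would read

    @property
    def language(self):
        return self._language

    @language.setter
    def language(self, value):
        self._language = value


# Assigning on an instance goes through the property setter.
p = Params()
p.language = "it"
instance_value = p._language

# Assigning on the class rebinds the class attribute to a plain string.
# The setter never runs, so the stored state of instances is untouched.
q = Params()
Params.language = "it"
class_value = q._language
```

In this sketch instance_value ends up as "it" while class_value is still "en", which would be consistent with the class-level assignment appearing to have no effect on transcription.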
I also tried that, but I get this error: whisper_lang_id: unknown language 'ӄ'
In fact, after the language property is set, reading w.params.language
raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xce in position 0: invalid continuation byte
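For context on that message, the sketch below reproduces the same kind of UnicodeDecodeError in plain Python. The byte values are hypothetical (the actual bytes behind w.params.language are not known from this thread); the only point is that 0xCE starts a two-byte UTF-8 sequence, so decoding fails when the next byte is not a continuation byte, which is what arbitrary memory contents would tend to produce.

```python
# Hypothetical bytes: the lead byte 0xCE followed by ASCII bytes.
# 0xCE announces a two-byte sequence, but 0x69 is not a continuation byte.
garbage = bytes([0xCE, 0x69, 0x74])

try:
    garbage.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# For comparison, 0xCE followed by a proper continuation byte decodes fine.
valid = bytes([0xCE, 0xB1]).decode("utf-8")  # U+03B1, Greek small letter alpha
```

An error of this shape suggests the string read back is not the one that was written, rather than a problem with the value "it" itself.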
I think there is a bug with the params c_str right now. Feel free to put up a PR to fix it. It is in src/whispercpp/api_export.cc, for the Params object.
Wish I could but I cannot... Hope someone else can fix this soon!
Describe the bug
Even if the original audio is not English speech, the transcription is always translated into English. I've tried to change the language property of params using api.Params.language = 'it', but it didn't work.
To reproduce
Expected behavior
No response
Environment
Python 3.9.6