ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

source material in multiple languages #749

Open MathiasSchindler opened 1 year ago

MathiasSchindler commented 1 year ago

Greetings,

I tried running whisper.cpp on a recording of a committee hearing at the European Parliament (which offers real-time interpretation in EU languages as well as the original file without interpretation). Is there a way to tell whisper.cpp to allow for changes in the spoken language during the conversation? On some occasions the same speaker switches languages within a single speech (the old and new presidents of the European Commission do that quite often), and not just for a few sprinkled-in terms but for full paragraphs. I would appreciate any suggestions on this matter.

ggerganov commented 1 year ago

Try using the stream example with the --lang auto argument. I'm not sure it will work, but it might be worth giving it a try. In general, switching languages is not trivially supported.
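
For a recorded file, one possible workaround is to detect the language per fixed-length window and then transcribe each run of same-language windows separately. The snippet below is only a minimal sketch of the bookkeeping step, not part of whisper.cpp: the `merge_language_spans` helper, the `LanguageSpan` type, and the sample detections are all hypothetical, and the per-window language labels are assumed to come from whatever detection step you run beforehand.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class LanguageSpan:
    """A contiguous stretch of audio with a single detected language."""
    start_s: float
    end_s: float
    lang: str


def merge_language_spans(
    windows: Iterable[Tuple[float, float, str]]
) -> List[LanguageSpan]:
    """Merge consecutive windows that share the same detected language.

    `windows` holds (start_seconds, end_seconds, language_code) tuples in
    chronological order. How the language code is obtained is outside the
    scope of this sketch.
    """
    spans: List[LanguageSpan] = []
    for start_s, end_s, lang in windows:
        if spans and spans[-1].lang == lang:
            # Same language as the previous window: extend the current span.
            spans[-1].end_s = end_s
        else:
            # Language changed (or first window): start a new span.
            spans.append(LanguageSpan(start_s, end_s, lang))
    return spans


# Hypothetical detections for five 30-second windows (made-up sample data).
detections = [
    (0.0, 30.0, "en"),
    (30.0, 60.0, "en"),
    (60.0, 90.0, "fr"),
    (90.0, 120.0, "de"),
    (120.0, 150.0, "de"),
]
spans = merge_language_spans(detections)
for span in spans:
    print(f"{span.start_s:6.1f}s - {span.end_s:6.1f}s  {span.lang}")
```

Each resulting span could then be transcribed with its language set explicitly instead of relying on a single detection at the start of the file. A switch that happens in the middle of a window would still be missed, so this only narrows the problem rather than solving it.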

cjheath commented 1 year ago

I have also been trying this. The behaviour seems to depend on whether it hears English or a foreign language first. When it starts with English, it will sometimes ignore the foreign speech and sometimes output [Speaking in French]. With translation enabled, it sometimes ignores foreign speech and sometimes leaves it untranslated. And sometimes, with translation enabled, it ignores the foreign language entirely, or only until it hears English, and then processes what follows. (Tested using English and my very mediocre French, German and Italian accents.)

I haven't yet worked out all the behaviour, but it definitely responds to the initial speech. It's a little frustrating.

Sternbach-Software commented 1 year ago

Code switching is currently an unsolved problem in AI...