Poor diarization.

MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

BSD 2-Clause "Simplified" License

3.58k stars, 313 forks

Hello, I'm a newbie and only started using the program today. I managed to set everything up, but speaker diarization isn't working well. First, the program never recognizes more than three speakers. Second, phrases from one speaker are often attributed to another. In other words, the speaker separation is very poor. My audio is in Russian. Is there an option I need to enable or a parameter I should tweak to improve the result? Thanks for any advice.

My launch command: python diarize.py -a "D:\Temp2\97\31231.mp3" --whisper-model large-v3 --language ru --device cuda
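
To put numbers on the two symptoms above (the speaker count and how speech is split between speakers), a small script along these lines may help when comparing runs. It is only a sketch: it assumes the transcript contains lines that begin with a label such as `Speaker 0:`, which is not confirmed in this thread, and `speaker_word_counts` is a hypothetical helper name, not part of the repository.

```python
import re
from collections import Counter

# ASSUMPTION: each speaker turn starts with a label like "Speaker 0: text".
# Adjust the pattern if the transcript produced by your run looks different.
SPEAKER_LINE = re.compile(r"^\s*(Speaker \d+)\s*:\s*(.*)$")


def speaker_word_counts(transcript: str) -> Counter:
    """Count whitespace-separated words attributed to each speaker label."""
    counts = Counter()
    current = None  # label of the speaker whose turn we are inside
    for line in transcript.splitlines():
        match = SPEAKER_LINE.match(line)
        if match:
            current = match.group(1)
            text = match.group(2)
        else:
            # Unlabeled lines are treated as a continuation of the last turn.
            text = line
        if current is not None:
            counts[current] += len(text.split())
    return counts


if __name__ == "__main__":
    # Made-up sample text, used only to show the call.
    sample = (
        "Speaker 0: Привет, как дела?\n"
        "Speaker 1: Хорошо, спасибо.\n"
        "Speaker 0: Отлично."
    )
    counts = speaker_word_counts(sample)
    print(f"distinct speakers: {len(counts)}")
    for speaker, words in sorted(counts.items()):
        print(f"{speaker}: {words} words")
```

Running this on the transcripts from two different runs gives a rough before-and-after comparison. It cannot tell whether a phrase was attributed to the right person, only how many labels appear and how much text each one received.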

Poor diarization. #254