MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License
3.28k stars, 272 forks

Article comparing whisper-diarization to other diarization methods #31

Closed: AlexandderGorodetski closed this 1 year ago

AlexandderGorodetski commented 1 year ago

Guys,

Do you have an article comparing whisper-diarization with other diarization methods?

Thanks a lot, AlexG.

MahmoudAshraf97 commented 1 year ago

Hi, I'll try to compare pyannote diarization with NeMo diarization, as they are the two best options available AFAIK.

smxsm commented 1 year ago

Whisper with pyannote works pretty well, too. I've played around with this: https://towardsdatascience.com/unlock-the-power-of-audio-data-advanced-transcription-and-diarization-with-whisper-whisperx-and-ed9424307281 The only problem was that it split some speakers into too many segments.

I've also tried Vosk, which can also distinguish between speakers (see https://github.com/alphacep/vosk-api/blob/master/python/example/test_speaker.py), but the speech recognition itself was really bad compared to Whisper.