-
Whisper has been incredibly good at transcription and subtitles, but the diarization feature, using https://huggingface.co/pyannote/speaker-diarization-3.0, looks like it isn't really reliable yet. I'…
-
### Tested versions
Tested with speaker-diarization-3.1
### System information
Windows 11, pyannote 3.3.2, Python 3.8.10
### Issue description
I am using the speaker-diarization pipeline and I …
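For context, a minimal sketch of how the 3.1 pipeline is typically loaded and applied (the token, audio file name, and device handling below are placeholders, not details taken from this report):

```python
# Minimal sketch of running the pyannote speaker-diarization pipeline.
# "YOUR_HF_TOKEN" and "meeting.wav" are placeholders.
import torch
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="YOUR_HF_TOKEN",
)
if torch.cuda.is_available():
    pipeline.to(torch.device("cuda"))  # optional: run on GPU when available

diarization = pipeline("meeting.wav")

# Iterate over the speaker turns the pipeline produced.
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.1f}s - {turn.end:.1f}s: {speaker}")
```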
-
Hi,
I'm not very familiar with the details of whisper or whisper.cpp, and I don't know whether it is even currently possible with the underlying model, but it would be nice if speakers could be marked or speaker-cha…
-
Hello NeMo Team,
I’m just a student highly inspired by your work on speaker diarization. I’m particularly excited about Sortformer and its novel use of Sort Loss to address the permutation problem in…
-
Maybe this is not an issue but a design decision: when choosing --diarize, it seems that only the output on the screen is diarized, while the output files do not contain the "(Speaker X)" prefix.
-
WhisperX diarization is done with Pyannote.
I'm using WhisperX for transcription in a closed environment with no internet access.
It works well with whisper transcription, since we can download the…
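One possible route, sketched here as an assumption rather than a confirmed WhisperX workflow: pyannote pipelines can also be loaded from a local config.yaml once the model files have been copied onto the offline machine, which avoids the Hugging Face download at runtime (the paths below are placeholders):

```python
# Sketch: load a pyannote diarization pipeline entirely from local files.
# "/models/speaker-diarization-3.1/config.yaml" is a placeholder path pointing
# at a config whose internal model references have been edited to local checkpoints.
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("/models/speaker-diarization-3.1/config.yaml")
diarization = pipeline("/data/interview.wav")

for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{speaker}: {turn.start:.2f}-{turn.end:.2f}")
```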
-
I was running NeMo on a 1-hour wav file with stemming turned on (demucs). Whisper and alignment run fine, but when it enters diarization, I encounter the error below.
The same file runs fine end-to-e…
-
Dsnote is great for STT using Whisper. For audio samples with multiple people speaking, e.g. podcasts, movies, …, one ends up with messy text because Whisper doesn’t do what’s called ‘speaker di…
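A rough sketch of the usual post-processing approach: run diarization separately and assign each Whisper segment the speaker whose turns overlap it most. The segment and turn structures below are assumed shapes for illustration, not Dsnote's or Whisper's actual internal format:

```python
# Sketch: label Whisper transcript segments with diarization speakers
# by picking, for each segment, the speaker with the largest time overlap.

def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals, in seconds."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def assign_speakers(segments, turns):
    """segments: [{'start', 'end', 'text'}], turns: [{'start', 'end', 'speaker'}]."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn in turns:
            ov = overlap(seg["start"], seg["end"], turn["start"], turn["end"])
            if ov > best_overlap:
                best_speaker, best_overlap = turn["speaker"], ov
        labeled.append({**seg, "speaker": best_speaker})
    return labeled

# Toy example with made-up timestamps.
segments = [{"start": 0.0, "end": 4.2, "text": "Welcome to the show."},
            {"start": 4.2, "end": 7.8, "text": "Thanks for having me."}]
turns = [{"start": 0.0, "end": 4.0, "speaker": "SPEAKER_00"},
         {"start": 4.0, "end": 8.0, "speaker": "SPEAKER_01"}]

for seg in assign_speakers(segments, turns):
    print(f"[{seg['speaker']}] {seg['text']}")
```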
-
Intel MKL allows users with Intel CPUs to run transcription and other AI models faster.
/bounty 100
context: https://github.com/mediar-ai/screenpipe/issues?q=mkl
MKL was hard to set up with wi…
-
It was clearly possible in the Azure AI Services playground (Real-time - Language identification + Speaker diarization), but I can't figure out how to implement it.
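For what it's worth, a rough sketch of the programmatic equivalent using the Speech SDK's ConversationTranscriber. The key, region, file name, and language list are placeholders, and whether the language-identification config can be combined with the transcriber this way depends on the SDK version, so treat it as an assumption to verify:

```python
# Sketch: transcription with speaker diarization via the Azure Speech SDK.
# "YOUR_KEY", "YOUR_REGION", "meeting.wav", and the language list are placeholders;
# passing auto_detect_source_language_config to ConversationTranscriber is an
# assumption to verify against the SDK version in use.
import time
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="YOUR_KEY", region="YOUR_REGION")
auto_detect = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(
    languages=["en-US", "de-DE"]
)
audio_config = speechsdk.audio.AudioConfig(filename="meeting.wav")

transcriber = speechsdk.transcription.ConversationTranscriber(
    speech_config=speech_config,
    audio_config=audio_config,
    auto_detect_source_language_config=auto_detect,
)

def on_transcribed(evt):
    # Each final result carries the recognized text and a speaker id (e.g. "Guest-1").
    print(f"{evt.result.speaker_id}: {evt.result.text}")

transcriber.transcribed.connect(on_transcribed)
transcriber.start_transcribing_async().get()
time.sleep(30)  # give the transcriber time to work through the audio
transcriber.stop_transcribing_async().get()
```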