Closed by couch-potahto 3 years ago
From the pyannote.metrics documentation:
Because manual annotations cannot be precise at the audio sample level, it is common in speaker diarization research to remove from evaluation a 500ms collar around each speaker turn boundary (250ms before and after). Most of the metrics available in pyannote.metrics support a collar parameter, which defaults to 0.
Using a collar is only a more permissive way of evaluating the pipeline: the pipeline will stay exactly the same.
That being said, you can evaluate the pipeline output with the pyannote-metrics.py command line tool and compare the DER with and without the --collar=0.5 option:
pyannote-metrics.py diarization --subset=test --collar=0.5 AMI.SpeakerDiarization.MixHeadset /path/to/your/hypothesis.rttm
See the pyannote.metrics documentation for more options, or the output of pyannote-metrics.py --help.
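To make the effect of the collar concrete, here is a minimal, library-free sketch of a frame-based error rate. It is not how pyannote.metrics computes DER: the function names (`label_at`, `simple_der`) and the toy data are hypothetical, and it assumes times in milliseconds, no overlapping speech, and hypothesis labels that are already mapped to the reference labels. As in the documentation quoted above, the collar value is the total width, so 500 ms means 250 ms on each side of every reference boundary.

```python
def label_at(turns, t):
    """Return the speaker active at time t (in ms), or None for non-speech."""
    for start, end, speaker in turns:
        if start <= t < end:
            return speaker
    return None


def simple_der(reference, hypothesis, collar_ms=0, step_ms=10):
    """Toy frame-based error rate (illustration only, not the real DER).

    Frames that fall within collar_ms / 2 of a reference turn boundary
    are removed from the evaluation. The hypothesis is never modified.
    """
    half = collar_ms / 2
    boundaries = [t for start, end, _ in reference for t in (start, end)]
    total_end = max(end for _, end, _ in reference + hypothesis)

    errors = 0
    scored_speech = 0
    for t in range(0, total_end, step_ms):
        # Skip frames that lie inside the collar of any reference boundary.
        if half > 0 and any(b - half <= t < b + half for b in boundaries):
            continue
        ref = label_at(reference, t)
        hyp = label_at(hypothesis, t)
        if ref is not None:
            scored_speech += 1
        if ref != hyp:
            errors += 1
    return errors / scored_speech if scored_speech else 0.0


# Hypothetical data: the hypothesis places the speaker change 400 ms late.
reference = [(0, 5000, "A"), (5000, 10000, "B")]
hypothesis = [(0, 5400, "A"), (5400, 10000, "B")]

der_plain = simple_der(reference, hypothesis)
der_collar = simple_der(reference, hypothesis, collar_ms=500)
print(f"without collar: {der_plain:.2%}")
print(f"with 500 ms collar: {der_collar:.2%}")
```

On this toy example the error rate drops from 4.0% to about 1.7%, although the hypothesis is identical in both calls. Only the part of the boundary error that falls outside the 250 ms window still counts, which is why a collar changes the reported number but not the pipeline.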
I am now going to transfer this issue to the pyannote.metrics repo because this is a metric issue, not a pipeline issue.
In the README for the Speaker Diarization Pipeline, it is mentioned that the DER is brought down to 15.3% with a collar of 250ms. May I ask how I can apply this collar to improve the DER of my pipeline?
Thank you!