How to apply collar metric on speaker diarization pipeline

pyannote / pyannote-metrics

A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems

http://pyannote.github.io/pyannote-metrics

MIT License


How to apply collar metric on speaker diarization pipeline #43

Closed: couch-potahto closed this issue 3 years ago

couch-potahto commented 4 years ago

In the README for the Speaker Diarization Pipeline, it is mentioned that the DER is brought down to 15.3% with a collar of 250ms. May I ask how I can apply this collar to improve the DER of my pipeline?

Thank you!

hbredin commented 4 years ago

From pyannote.metrics documentation:

Because manual annotations cannot be precise at the audio sample level, it is common in speaker diarization research to remove from evaluation a 500ms collar around each speaker turn boundary (250ms before and after). Most of the metrics available in pyannote.metrics support a collar parameter, which defaults to 0.

Using a collar is only a more permissive way of evaluating the pipeline: the pipeline will stay exactly the same.
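To illustrate why a collar changes the score but not the pipeline, here is a minimal, self-contained sketch. It is not the pyannote.metrics implementation: the function names, the 10 ms frame grid, and the toy speaker turns are all assumptions made for illustration. It also assumes speaker labels are already matched between reference and hypothesis and that there is no overlapped speech. As in the documentation quoted above, `collar_ms` is the total collar width (500 ms means 250 ms before and after each reference boundary).

```python
def label_at(turns, t_ms):
    """Return the speaker active at time t_ms, or None for non-speech."""
    for start, end, speaker in turns:
        if start <= t_ms < end:
            return speaker
    return None


def error_rate(reference, hypothesis, collar_ms=0, step_ms=10):
    """Simplified frame-level error rate (hypothetical helper, not pyannote's).

    Frames whose center lies within collar_ms / 2 of any reference turn
    boundary are removed from evaluation. Errors are divided by the number
    of remaining frames that contain reference speech.
    """
    boundaries = {t for start, end, _ in reference for t in (start, end)}
    half = collar_ms / 2
    total_end = max(end for _, end, _ in reference + hypothesis)

    errors = scored = 0
    for t in range(0, total_end, step_ms):
        center = t + step_ms / 2
        if any(abs(center - b) < half for b in boundaries):
            continue  # inside a collar: not evaluated
        ref = label_at(reference, center)
        hyp = label_at(hypothesis, center)
        if ref is not None:
            scored += 1
        if ref != hyp:
            errors += 1  # missed speech, false alarm, or wrong speaker
    return errors / scored if scored else 0.0


# Toy example (times in ms): the hypothesis places the speaker change 100 ms late.
reference = [(0, 2000, "A"), (2000, 4000, "B")]
hypothesis = [(0, 2100, "A"), (2100, 4000, "B")]

without_collar = error_rate(reference, hypothesis, collar_ms=0)
with_collar = error_rate(reference, hypothesis, collar_ms=500)
print(without_collar, with_collar)
```

The hypothesis is identical in both calls. Only the set of frames that are scored changes, so the 100 ms boundary error counts without a collar and is ignored with one.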

hbredin commented 4 years ago

That being said, you can evaluate the pipeline output with the pyannote-metrics.py command line tool and compare the DER with and without the --collar=0.5 option (a 0.5s collar, i.e. 250ms on each side of every boundary):

pyannote-metrics.py diarization --subset=test --collar=0.5 AMI.SpeakerDiarization.MixHeadset /path/to/your/hypothesis.rttm

See the pyannote.metrics documentation for more options, or the output of pyannote-metrics.py --help.

hbredin commented 4 years ago

I am now going to transfer this issue to the pyannote.metrics repo because this is a metric issue, not a pipeline issue.