Issue understanding the outputs of coverage and purity metrics

First, thank you for this open-source project!

I started looking at speaker change detection algorithms and discovered this open-source project. As I am a newbie in this field, I am still struggling to understand which measure to use to evaluate a speaker change detection module. The coverage and purity measures are well explained on this page: https://pyannote.github.io/pyannote-metrics/reference.html

I had a look at a previous issue in which someone mentioned that they always get a purity of 100% even though their system is not perfect. Someone replied that they should rather use DiarizationPurity and DiarizationCoverage for a speaker change detection task, which is the task I want to perform.

I tried them on a toy example:

    from pyannote.core import Annotation, Segment
    from pyannote.metrics.diarization import DiarizationPurity, DiarizationCoverage

    purity = DiarizationPurity()
    coverage = DiarizationCoverage()

    reference = Annotation()
    reference[Segment(1, 2)] = "a"
    reference[Segment(3, 5)] = "b"

    hypothesis = Annotation()
    hypothesis[Segment(1, 5)] = "A"

I get a purity of 66.66%, whereas I would expect a purity of 50%: for the hypothesis segment A, the most-covering reference segment is b, with an overlap of 2, so purity = 2/4 = 0.5.
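To make the two numbers concrete, here is a small pure-Python sketch of the arithmetic on the toy example. It does not call pyannote.metrics and is not the library's implementation; the helper names `overlap` and `purity_terms` are made up for illustration. It is based on an assumption: that the 66.66% comes from dividing by the duration of the hypothesis that overlaps some reference label (here 1 + 2 = 3) rather than by the full hypothesis duration (4), so the unlabelled gap between 2 and 3 would not be counted.

```python
def overlap(a, b):
    """Duration of the intersection of two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


# Toy example from this issue, written as plain (start, end) tuples.
reference = {"a": [(1, 2)], "b": [(3, 5)]}
hypothesis = {"A": [(1, 5)]}


def purity_terms(reference, hypothesis):
    """Return (best overlap, labelled overlap, full hypothesis duration)."""
    best_total = 0.0      # overlap with the dominant reference label
    labelled_total = 0.0  # overlap with any reference label
    full_total = 0.0      # full duration of the hypothesis segments
    for hyp_segments in hypothesis.values():
        per_label = {
            label: sum(overlap(h, r) for h in hyp_segments for r in ref_segments)
            for label, ref_segments in reference.items()
        }
        best_total += max(per_label.values(), default=0.0)
        labelled_total += sum(per_label.values())
        full_total += sum(end - start for start, end in hyp_segments)
    return best_total, labelled_total, full_total


best, labelled, full = purity_terms(reference, hypothesis)
print(best / full)      # normalizing by the full hypothesis duration: 2 / 4
print(best / labelled)  # normalizing by the labelled overlap only: 2 / 3
```

Under that assumption, the second ratio matches the value I observe, while the first matches the value I expected.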

Could you explain where my understanding is wrong, and tell me how I should use those metrics?

Thank you in advance for the advice and explanations!

pyannote / pyannote-metrics

Issue understanding the outputs of coverage and purity metrics #55