Closed fayejf closed 3 years ago
@nithinraok
Thanks! This is an imprecision in the documentation.
For diarization, it should read: "total is the total duration of speech turns in the reference". Can you please provide a link to where exactly you found this error? Or, even better, make a PR to fix it?
But, maybe there is also some misunderstanding on your side. In your example above, the same speaker "B" seems to speak twice during the [0.5, 1] time range. When can this happen?
We found the error while checking the false alarm and missed detection values from DetectionErrorRate and DiarizationErrorRate. In theory they should match, but they did not when the RTTM file contains overlapping segments.
Even if we replace a[Segment(0.5, 2)]='B' with A, we get the same error -> different total length.
As stated in my previous comment, this behavior is actually expected. I agree that the documentation is misleading, though. I'd gladly merge a PR fixing the documentation.
For detection, total is the total duration of speech activity (i.e. where at least one person speaks).
This is meant as an evaluation metric for binary classification (a.k.a. detection) tasks.
For diarization, total is the total duration of speech turns (i.e. the sum of speech turn durations over all speakers).
Hence, two overlapping speakers are counted twice. This is what md-eval.pl or dscore do as well, and it is the way the community does speaker diarization evaluation.
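To make the difference between the two definitions concrete, here is a minimal sketch in plain Python. It does not call pyannote.metrics; the helper names union_duration and summed_duration are hypothetical and only illustrate the two ways of counting described above, using the two segments from the example in this issue.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def union_duration(turns: List[Interval]) -> float:
    """Duration covered by at least one turn (detection-style total)."""
    total = 0.0
    current_start: Optional[float] = None
    current_end: Optional[float] = None
    for start, end in sorted(turns):
        if current_end is None or start > current_end:
            # Gap before this turn: close the previous merged region, if any.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching turn: extend the merged region.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def summed_duration(turns: List[Interval]) -> float:
    """Sum of all turn durations (diarization-style total).

    Overlapping turns are counted once per turn.
    """
    return sum(end - start for start, end in turns)


# The two segments from the example in this issue: [0, 1] and [0.5, 2].
turns = [(0.0, 1.0), (0.5, 2.0)]
detection_total = union_duration(turns)
diarization_total = summed_duration(turns)
print(detection_total)    # 2.0
print(diarization_total)  # 2.5
```

The 0.5 s gap between the two totals is exactly the [0.5, 1] region, which the summed version counts twice.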
Thanks, @hbredin for the clear explanation and reference. Submitted PR for doc correction: https://github.com/pyannote/pyannote-metrics/pull/50
PR merged. Thanks!
Description
Detection Error Rate and Diarization Error Rate report different total values when the reference contains overlapping segments.
Steps/Code to Reproduce
from pyannote.core import Annotation, Segment
from pyannote.metrics import detection, diarization

a = Annotation('hello')
a[Segment(0, 1)]='B'
a[Segment(0.5, 2)]='B' # overlap

metric_det = detection.DetectionErrorRate(collar=0, skip_overlap=False)
metric_det(a,a)
print(metric_det['total']) # 2

metric_dia = diarization.DiarizationErrorRate(collar=0, skip_overlap=False)
metric_dia(a,a)
print(metric_dia['total']) # 2.5
Expected Results
Quote from the documentation:
In Detection: "total is the total duration of speech in the reference."
In Diarization: "total is the total duration of speech in the reference."
We expected the total from the two metrics to be identical.
Actual Results
metric_det['total'] = 2
metric_dia['total'] = 2.5
Versions
pyannote.core==4.1
pyannote.metrics==3.0.1