Question: computation of detection accuracy

pyannote / pyannote-metrics

A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems

http://pyannote.github.io/pyannote-metrics

MIT License


Question: computation of detection accuracy #32

Closed. GladB closed this issue 5 years ago.

GladB commented 5 years ago

Hello,

In the documentation of the detection accuracy metric, you mention that "Gaps in the inputs considered as the negative class". What about silence that is not between two segments, such as silence at the very end of the file? Say one annotation has speech from 0 to 30 seconds of a two-minute file and the other has speech from 30 to 60 seconds. Does this mean that the last minute, where both agree on silence, will not be taken into account, so that there are no true negatives and the accuracy is 0?

Thank you!

hbredin commented 5 years ago

I agree that the first part of the documentation of DetectionAccuracy might be confusing. I believe the second part answers your question though:

accuracy = (tp + tn) / total, where tp is the duration of true positive (e.g. speech classified as speech), tn is the duration of true negative (e.g. non-speech classified as non-speech), and total is the total duration of the input signal.

tn also includes non-speech at the very beginning and the very end of the file.
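As a library-free illustration of that formula, here is a minimal sketch applied to the example from the question. The helper names (overlap, gaps, detection_accuracy) and the (start, end) tuple representation are hypothetical and are not part of pyannote; the sketch also assumes each list of speech segments is sorted and non-overlapping.

```python
def overlap(*intervals):
    """Duration (in seconds) shared by all the given (start, end) intervals."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return max(0.0, end - start)


def gaps(segments, extent):
    """Non-speech intervals inside `extent`.

    Assumes `segments` is a sorted list of non-overlapping (start, end) tuples.
    """
    start, end = extent
    result, cursor = [], start
    for s, e in segments:
        s, e = max(s, start), min(e, end)
        if s >= e:
            continue  # segment lies entirely outside the evaluated extent
        if s > cursor:
            result.append((cursor, s))
        cursor = max(cursor, e)
    if cursor < end:
        result.append((cursor, end))
    return result


def detection_accuracy(reference, hypothesis, extent):
    """accuracy = (tp + tn) / total, measured over `extent` only."""
    # tp: speech in both reference and hypothesis
    tp = sum(overlap(r, h, extent) for r in reference for h in hypothesis)
    # tn: non-speech in both reference and hypothesis
    tn = sum(
        overlap(r, h)
        for r in gaps(reference, extent)
        for h in gaps(hypothesis, extent)
    )
    total = extent[1] - extent[0]
    return (tp + tn) / total


reference = [(0, 30)]    # speech from 0 to 30 seconds
hypothesis = [(30, 60)]  # speech from 30 to 60 seconds

# Evaluated over the whole two-minute file: tp = 0, tn = 60, total = 120.
print(detection_accuracy(reference, hypothesis, extent=(0, 120)))  # 0.5
```

Evaluated over the whole file, the last minute of shared silence counts as true negative. If the same function is called with extent=(0, 60), that minute is excluded, tn drops to 0, and the accuracy is 0.0.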

GladB commented 5 years ago

I am very sorry, but I think I am missing something. I have been giving Annotation() objects as arguments of DetectionAccuracy(), and as far as I know, the support of Annotation() objects is from the start of the first segment to the end of the last segment, so the only TN found in this case are silences between segments, not at the beginning or end. Is there a way to specify the start time and end time of an annotation independently of the segments it contains? (I am now realizing that this may be a pyannote.core issue rather than pyannote.metrics)

hbredin commented 5 years ago

Oh. I understand your problem.

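You need to provide the metric with an evaluation map (the uem argument) that covers the whole file; otherwise only the region spanned by the annotations is evaluated. In your case, you should do something like: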

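from pyannote.core import Segment, Timeline
from pyannote.metrics.detection import DetectionAccuracy
metric = DetectionAccuracy()
# file_duration: total length of the file in seconds; the uem spans all of it
uem = Timeline(segments=[Segment(0, file_duration)])
accuracy = metric(reference, hypothesis, uem=uem)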

GladB commented 5 years ago

Oh great! Thank you very much.