pyannote / pyannote-metrics

A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems

http://pyannote.github.io/pyannote-metrics

MIT License

Overlaps #61

Closed EmreOzkose closed 2 years ago

EmreOzkose commented 2 years ago

Can we find overlaps with pyannote?

hbredin commented 2 years ago

I did not understand the question.

EmreOzkose commented 2 years ago

For example, let's say we have a 5-minute recording that contains speech from speaker A and speaker B.

speaker A speaks between 0:00 - 3:00
speaker B speaks between 2:00 - 5:00

So we have a 1-minute overlap between 2:00 - 3:00. Can we obtain the overlap duration with pyannote?

hbredin commented 2 years ago

You opened this issue in pyannote.metrics, not pyannote.audio. Therefore, I am confused:

- are you interested in detecting overlapped speech regions in an audio file (in which case you should look at pyannote.audio)?
- or are you interested in computing the intersection of two segments given their start and end time (in which case you should look at pyannote.core documentation)?
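
For the second case, the computation itself is small. The following is a minimal plain-Python sketch using the times from the example above, converted to seconds. The `intersection` helper and the `Interval` alias are hypothetical names introduced for illustration, not pyannote functions; in pyannote.core, the `&` operator on two `Segment` objects plays a similar role.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def intersection(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the overlapping part of two intervals, or None if they do not overlap."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start < end else None


speaker_a = (0.0, 180.0)    # 0:00 - 3:00
speaker_b = (120.0, 300.0)  # 2:00 - 5:00

overlap = intersection(speaker_a, speaker_b)
overlap_duration = overlap[1] - overlap[0] if overlap else 0.0
print(overlap, overlap_duration)  # (120.0, 180.0) 60.0
```

Intervals that only touch at a boundary (for example 0-1 and 1-2) are treated as non-overlapping here, which is a design choice of this sketch.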

EmreOzkose commented 2 years ago

For example, we have two annotations like the ones below:

from pyannote.core import Annotation, Segment

reference = Annotation()
reference[Segment(0, 15)] = 'A'
reference[Segment(10, 20)] = 'B'

hypothesis = Annotation()
hypothesis[Segment(0, 15)] = 'a'
hypothesis[Segment(15, 20)] = 'b'

and we can calculate the DER from them:

from pyannote.metrics.diarization import DiarizationErrorRate
diarizationErrorRate = DiarizationErrorRate()

result = diarizationErrorRate(reference, hypothesis, detailed=True, uem=Segment(0, 40))
result

The result is:

{'false alarm': 0.0, 'total': 25.0, 'correct': 20.0, 'missed detection': 5.0, 'confusion': 0.0, 'diarization error rate': 0.2}

Can we obtain the overlap duration (overlap: 5.0) in the result dictionary, or with another function?
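
The numbers in that result can be checked by hand. The sketch below redoes the arithmetic in plain Python, assuming the optimal speaker mapping is a to A and b to B; the variable names are introduced here for illustration only. In this particular example the missed detection happens to equal the overlap duration, because the hypothesis contains a single speaker during the overlapped region. That would not hold in general.

```python
# Durations in seconds, read off the two annotations above.
# Reference: A speaks 0-15 (15 s), B speaks 10-20 (10 s).
reference_total = (15 - 0) + (20 - 10)  # 25 s of reference speech

# Between 10 s and 15 s the reference has two speakers (A and B),
# but the hypothesis has only one ('a'), so one speaker is missed for 5 s.
missed_detection = 15 - 10
false_alarm = 0.0
confusion = 0.0

# DER = (false alarm + missed detection + confusion) / total reference speech
der = (false_alarm + missed_detection + confusion) / reference_total
correct = reference_total - missed_detection - confusion
print(der, correct)  # 0.2 20.0
```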

hbredin commented 2 years ago

Not out of the box. This might help, though.

EmreOzkose commented 2 years ago

I guess this script will help me, thank you so much.

EmreOzkose commented 2 years ago

reference.discretize(resolution=0.01) helped, but I needed an extra calculation to find the total overlap duration. Then I dug into the reference object and found get_overlap(). In the end, this function did what I needed.

from pathlib import Path
from pyannote.database.util import load_rttm

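def get_overlaps(rttm_path: str) -> float:
    """Return the total overlap duration between speakers in an RTTM file."""
    # load_rttm returns a dict of annotations keyed by file URI;
    # this assumes the URI in the RTTM file matches the file name stem.
    name = Path(rttm_path).stem
    annotation = load_rttm(rttm_path)[name]

    # get_overlap() returns a timeline of the overlapped regions.
    total_dur = 0.0
    for segment in annotation.get_overlap():
        total_dur += segment.duration
    return total_dur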
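
To sanity-check the value returned by a function like this without pyannote, the same quantity can be computed with a small sweep over segment boundaries. This is only a sketch: `total_overlap` is a hypothetical helper, it takes plain (start, end) tuples, and it treats every segment as a separate speaker turn, which may differ from how get_overlap() handles labels.

```python
from typing import Iterable, Tuple


def total_overlap(segments: Iterable[Tuple[float, float]]) -> float:
    """Total time during which at least two segments are active."""
    events = []
    for start, end in segments:
        events.append((start, 1))   # a segment opens
        events.append((end, -1))    # a segment closes
    # Sort by time; at equal times, closings (-1) sort before openings (+1),
    # so segments that only touch are not counted as overlapping.
    events.sort()

    active = 0
    overlap = 0.0
    previous_time = None
    for time, delta in events:
        if active >= 2:
            overlap += time - previous_time
        active += delta
        previous_time = time
    return overlap


# Reference annotation from earlier in the thread: A = 0-15, B = 10-20.
print(total_overlap([(0, 15), (10, 20)]))  # 5.0
```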