feeding information on annotations from a csv (non-rttm like) format

pyannote / pyannote-metrics

A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems

MIT License

183 stars, 30 forks

Description

We're calling pyannote.metrics from within R (package tutorial), specifically to get the identification error rate, so we hope to use as few Python packages and other adaptations as possible. Our annotations are in a CSV file, which contains the following information for each annotated segment:

onset
offset
talker

Is it truly necessary to create RTTM-like files, or could we build the annotations directly, as in the code stub below?

Steps/Code to Reproduce

This code doesn't run; it is just a stub of what the code would look like:

from pyannote.core import Annotation, Segment
reference = Annotation()

# read csv

# create a segment matrix, with things like (sorry, bash syntax cameo!):
# reference[Segment($onset, $offset)] = $talker

from pyannote.metrics.identification import IdentificationErrorRate
metrics = {}
metrics["ider"] = IdentificationErrorRate(parallel=True)

# write csv with results

Thanks in advance!

from pyannote.core import Segment, Annotation

manual_reference = Annotation()
for onset, offset, speaker in ...:
    manual_reference[Segment(onset, offset)] = speaker

automatic_hypothesis = Annotation()
for onset, offset, speaker in ...:
    automatic_hypothesis[Segment(onset, offset)] = speaker

from pyannote.metrics.identification import IdentificationErrorRate
metric = IdentificationErrorRate()
error_rate = metric(manual_reference, automatic_hypothesis)
print(error_rate)
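To connect the CSV described in the question to the loops above, a minimal sketch could look like the following. It assumes each CSV file has a header row with columns named onset, offset and talker, and that onsets and offsets are in seconds; the helper names (`read_segments`, `to_annotation`, `identification_error_rate`) are hypothetical and not part of pyannote. Only the standard-library `csv` module is used for parsing, which keeps the Python dependencies down to pyannote itself.

```python
import csv


def read_segments(csv_path):
    """Return a list of (onset, offset, talker) tuples read from a CSV file.

    Assumption: the file has a header row with columns named
    onset, offset and talker, and times are given in seconds.
    """
    with open(csv_path, newline="") as handle:
        return [
            (float(row["onset"]), float(row["offset"]), row["talker"])
            for row in csv.DictReader(handle)
        ]


def to_annotation(segments):
    """Build a pyannote.core Annotation from (onset, offset, talker) tuples."""
    # Imported lazily so that read_segments works without pyannote installed.
    from pyannote.core import Annotation, Segment

    annotation = Annotation()
    for onset, offset, talker in segments:
        annotation[Segment(onset, offset)] = talker
    return annotation


def identification_error_rate(reference_csv, hypothesis_csv):
    """Compute the identification error rate between two CSV annotations."""
    from pyannote.metrics.identification import IdentificationErrorRate

    reference = to_annotation(read_segments(reference_csv))
    hypothesis = to_annotation(read_segments(hypothesis_csv))
    metric = IdentificationErrorRate()
    return metric(reference, hypothesis)
```

With two such files on disk, `identification_error_rate("reference.csv", "hypothesis.csv")` would return the error rate as a float; both file names here are placeholders. No RTTM files are involved at any point, since the annotations are built in memory.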
feeding information on annotations from a csv (non-rttm like) format #52