pyannote / pyannote-database

Reproducible experimental protocols for multimedia (audio, video, text) databases
MIT License

Faster RTTMLoader #94

Open hbredin opened 1 year ago

hbredin commented 1 year ago

The RTTMLoader class is extremely slow for large RTTM files that contain annotations for multiple audio files (e.g. the VoxCeleb dataset).

We should make it faster!
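
To illustrate the kind of workload involved, here is a minimal sketch of reading a multi-file RTTM in a single pass and grouping speaker turns by file identifier. It is not the RTTMLoader implementation and not necessarily the approach taken in the fix; `group_rttm_by_uri` and the sample data are hypothetical names made up for this example, and only the standard RTTM field layout (`SPEAKER <file> <channel> <onset> <duration> ...`) is assumed.

```python
from collections import defaultdict
import io


def group_rttm_by_uri(lines):
    """Hypothetical helper: parse RTTM lines once, grouping turns by file id."""
    turns = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank lines and anything that is not a SPEAKER record.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        uri = fields[1]
        onset, duration = float(fields[3]), float(fields[4])
        speaker = fields[7]
        # Store each turn as (start, end, speaker label).
        turns[uri].append((onset, onset + duration, speaker))
    return dict(turns)


# Made-up sample standing in for a large RTTM file covering several audio files.
sample = io.StringIO(
    "SPEAKER file1 1 0.00 1.50 <NA> <NA> spk_a <NA> <NA>\n"
    "SPEAKER file2 1 0.20 0.80 <NA> <NA> spk_b <NA> <NA>\n"
    "SPEAKER file1 1 2.00 0.50 <NA> <NA> spk_b <NA> <NA>\n"
)
grouped = group_rttm_by_uri(sample)
```

Grouping in one pass keeps the cost proportional to the number of lines, rather than rescanning or refiltering the whole file for every audio file it describes.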

hbredin commented 1 year ago

cc @clement-pages

I am not assigning this issue to you but just wanted to let you know that I took note of what we discussed today.

hbredin commented 1 year ago

I have just pushed two PRs that should make things much faster:

I still need to make sure those PRs do not break anything, but you can already try them on your use case (this requires installing both pyannote.database and pyannote.core from the corresponding branches).