Open hbredin opened 1 year ago
cc @clement-pages
I am not assigning this issue to you but just wanted to let you know that I took note of what we discussed today.
I have just pushed two PRs that should make things much faster:
- `pyannote.database` PR relies on the vanilla `csv` library instead of pandas
- `pyannote.core` PR switches from `sortedcontainers.SortedDict` to vanilla `dict` in `Annotation` internals (making `Annotation.__init__` orders of magnitude faster).

I still need to make sure those PRs do not break anything, but you could already try them on your use case (this requires that you install both `pyannote.database` and `pyannote.core` from the corresponding branches).
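To illustrate why dropping `SortedDict` can speed up construction, here is a minimal sketch of the general idea: insert into a plain `dict` in O(1) and sort only when ordered iteration is requested. This is not the code from the `pyannote.core` PR; the `LazyTracks` class and its methods are hypothetical names made up for this example.

```python
class LazyTracks:
    """Hypothetical container (not pyannote.core code): a plain dict
    whose keys are sorted lazily, only when ordered access is needed."""

    def __init__(self):
        self._tracks = {}          # (start, end) -> label
        self._sorted_keys = None   # cached ordering, invalidated on write

    def add(self, start, end, label):
        # O(1) insertion: no ordering is maintained at write time
        self._tracks[(start, end)] = label
        self._sorted_keys = None

    def itertracks(self):
        # Sort once on first read after a write, then reuse the cache
        if self._sorted_keys is None:
            self._sorted_keys = sorted(self._tracks)
        for key in self._sorted_keys:
            yield key, self._tracks[key]


tracks = LazyTracks()
tracks.add(2.0, 3.0, "spk_b")
tracks.add(0.0, 1.0, "spk_a")
ordered = list(tracks.itertracks())
```

The trade-off is that the first ordered read after a batch of writes pays for one full sort, which is usually cheaper than keeping the structure sorted on every insertion when objects are built once and read later.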
The `RTTMLoader` class is extremely slow for large RTTM files that contain annotations for many audio files (e.g. the VoxCeleb dataset).
We should make it faster!
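As a rough sketch of what a faster loader could look like, the snippet below reads an RTTM stream once with the standard `csv` module and groups speaker turns by file URI. It assumes the usual space-separated RTTM layout (`SPEAKER <uri> <channel> <start> <duration> <NA> <NA> <speaker> <NA> <NA>`); the `load_rttm` function is a hypothetical helper, not the actual `RTTMLoader` implementation.

```python
import csv
import io
from collections import defaultdict


def load_rttm(lines):
    """Group RTTM speaker turns by file URI in a single pass.

    `lines` is any iterable of text lines (e.g. an open file).
    Hypothetical helper for illustration only.
    """
    turns = defaultdict(list)
    reader = csv.reader(lines, delimiter=" ", skipinitialspace=True)
    for row in reader:
        # Skip blank lines and non-SPEAKER records
        if not row or row[0] != "SPEAKER":
            continue
        uri, speaker = row[1], row[7]
        start, duration = float(row[3]), float(row[4])
        turns[uri].append((start, start + duration, speaker))
    return dict(turns)


sample = io.StringIO(
    "SPEAKER file1 1 0.00 1.50 <NA> <NA> spk_a <NA> <NA>\n"
    "SPEAKER file2 1 0.25 2.00 <NA> <NA> spk_b <NA> <NA>\n"
    "SPEAKER file1 1 1.50 0.50 <NA> <NA> spk_b <NA> <NA>\n"
)
turns_by_uri = load_rttm(sample)
```

Each line is touched once and no intermediate data frame is built, so the cost stays linear in the number of lines regardless of how many audio files the RTTM file covers.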