pyannote / pyannote-database

Reproducible experimental protocols for multimedia (audio, video, text) database
MIT License

Introducing pyannote.database.registry #89

Closed FrenchKrab closed 1 year ago

FrenchKrab commented 1 year ago

PR summary

Previously, only a single configuration file could be loaded. All data was stored in global variables in the pyannote.database __init__.py file, and the way things were loaded made it impossible to use multiple configuration files ("database.yml").

To solve that, a new class, PyannoteDbConfig, is responsible for loading YAML files and "merging" the different configuration files. This simply means merging the defined databases and handling cases where the same protocol is defined multiple times.
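To illustrate what "merging" could look like, here is a minimal sketch that works on already-parsed configuration dicts. It is not the code from this PR: `merge_configs` is a hypothetical helper, the `Databases` / `Protocols` layout is assumed to mirror "database.yml", and the "later file wins" policy for duplicated protocols is an assumption chosen only for illustration.

```python
from copy import deepcopy


def merge_configs(configs):
    """Merge parsed config dicts in order (hypothetical helper).

    Returns the merged dict and the list of protocols that were
    defined more than once. Assumption: the later definition wins.
    """
    merged = {"Databases": {}, "Protocols": {}}
    overridden = []

    for config in configs:
        # Databases: accumulate path templates, skipping duplicates.
        for name, paths in config.get("Databases", {}).items():
            paths = [paths] if isinstance(paths, str) else list(paths)
            known = merged["Databases"].setdefault(name, [])
            known.extend(p for p in paths if p not in known)

        # Protocols: nested as database -> task -> protocol name.
        for database, tasks in config.get("Protocols", {}).items():
            for task, protocols in tasks.items():
                slot = merged["Protocols"].setdefault(database, {}).setdefault(task, {})
                for protocol, spec in protocols.items():
                    if protocol in slot:
                        # Same protocol defined in several files: record it.
                        overridden.append(f"{database}.{task}.{protocol}")
                    slot[protocol] = deepcopy(spec)

    return merged, overridden
```

Returning the list of overridden protocols, rather than silently replacing them, leaves it to the caller to decide whether a duplicate should raise, warn, or be ignored.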

Most of the logic remains similar: global functions have simply been moved to the new class, where they update its state instead of global variables. Only the FileFinder class had some of its path-resolving logic moved upstream to PyannoteDbConfig (because it relies on the path of the database.yml being used, which FileFinder can't know).
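The reason path resolution depends on the configuration file can be sketched as follows. This is only an illustration under stated assumptions: `resolve_template` is a hypothetical function (not the API introduced by this PR), paths are assumed to be POSIX-style, and relative templates are assumed to be interpreted relative to the directory containing the "database.yml" that declares them.

```python
from pathlib import PurePosixPath


def resolve_template(template: str, config_path: str) -> str:
    """Anchor a relative path template at the config file's directory.

    Hypothetical helper: absolute templates are returned unchanged,
    relative ones are joined to the parent directory of `config_path`.
    Placeholders such as "{uri}" are left untouched.
    """
    path = PurePosixPath(template)
    if path.is_absolute():
        return str(path)
    return str(PurePosixPath(config_path).parent / path)
```

With several configuration files loaded at once, the same relative template can point to different locations depending on which file declared it, so the resolution has to happen where the file path is still known.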

What's left

hbredin commented 1 year ago

For some reason I cannot edit this PR. Can you make it editable by me?

FrenchKrab commented 1 year ago

I (should have) fixed the raised issues (down to 7 edited files, that's indeed better)

FrenchKrab commented 1 year ago

These commits should fix the raised issues

hbredin commented 1 year ago

I have found two packages where the change in FileFinder API would break things:

Not sure what to do about this yet but I wanted to keep track of this potential problem.

hbredin commented 1 year ago

@FrenchKrab I would not mind a final review on this (but that can wait for next week)