feat: add support for custom SpeakerVerification protocols

vbrignatz commented 4 years ago

I modified custom.py so that it supports creating speaker verification protocols on the fly.

I added two keys to the configuration file database.yml:

duration, for a duration file (such files seem common in these tasks).
trial, for a trial file, which speaker verification needs.

I tested both validation and training of a speaker embedding model, using VoxCeleb2 as a custom dataset that I named myVoxCeleb, and it worked.

hbredin commented 4 years ago

Thanks @vbrignatz - this is a very nice addition to the package 🎉

I have been working in parallel on refactoring pyannote.database.custom (and the introduction of custom data loaders) to make it much more flexible. Work is in progress in branch custom (or pull request #51).

This will undoubtedly conflict with the changes you propose in this pull request. Therefore, I will come back to this pull request when #51 has been merged.

Feel free to give your opinion on #51 as well (in particular how it could be improved to make speaker identification custom protocols easier to define).

hbredin commented 4 years ago

Hi @vbrignatz, would you mind updating your PR to work on the custom branch?

I have added a bunch of things (and proper documentation) that should make the integration of new custom tasks (such as speaker verification) easier.

If you do not want to do it, or do not have time, please let me know and I will merge the custom branch as it is. Otherwise, it can wait for your updated PR.

vbrignatz commented 4 years ago

Hi @hbredin, I will work on this tomorrow. FYI, my thinking is that I should create:

the TRIALLoader class to load the trials file
the DURLoader class to load the durations files
the subset_trial_iter function that will create the trial function needed in SpkVerif protocols

and that I should modify:

add_custom_protocols
create_protocol

to support the custom SpkVerif protocols.
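As a rough illustration of the trial-loading part of this plan, here is a minimal sketch. It assumes a trial file with one `label uri1 uri2` triple per line (a VoxCeleb-style layout; the thread does not specify the actual format). The function name `iter_trials` and the dictionary keys are hypothetical, not the names used in the PR.

```python
import io
from typing import Dict, Iterator, TextIO


def iter_trials(lines: TextIO) -> Iterator[Dict[str, object]]:
    """Yield one trial per non-empty line.

    Assumed (hypothetical) line format: `label uri1 uri2`,
    where label is 1 for a target trial and 0 for a non-target trial.
    """
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 fields, got {len(fields)}"
            )
        label, uri1, uri2 = fields
        # Key names are illustrative only.
        yield {"reference": int(label), "file1": uri1, "file2": uri2}


# Small in-memory example standing in for a trial file on disk.
example = io.StringIO("1 spk1/utt1 spk1/utt2\n0 spk1/utt1 spk2/utt1\n")
trials = list(iter_trials(example))
```

Writing it as a generator keeps memory use flat for long trial lists, which may matter for datasets of VoxCeleb's size.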

hbredin commented 4 years ago

> Hi @hbredin, I will work on this tomorrow.

Great. Thanks!

> the DURLoader class to load the durations files

It could be something slightly more generic that expects a text file (e.g. with a .map suffix) containing one "uri value" pair per line:

filename1 value1
filename2 value2
filename3 value3

I can already think of two use cases for this kind of data loader:

file duration:

filename1 60.0
filename2 123.450
filename3 32.400

audio domain:

filename1 radio
filename2 radio
filename3 phone

The only issue I foresee is how to make sure that durations are returned as float and domains as str. I think pandas.read_csv (pandas is already a requirement for pyannote.database anyway) is smart enough to do the conversion itself but maybe there is another way...
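One possible "other way", sketched here with the standard library only: read every value as text, then convert the whole column to float only if every value parses as a number. This gives the same per-column float/str outcome the comment hopes to get from pandas.read_csv. The function name `load_map` is hypothetical, and the two in-memory files reuse the examples above.

```python
import io
from typing import Dict, List, TextIO, Union

Value = Union[float, str]


def load_map(lines: TextIO) -> Dict[str, Value]:
    """Load `uri value` lines into a dict (hypothetical helper).

    The value column is returned as float if *every* value parses
    as a number, and left as str otherwise.
    """
    uris: List[str] = []
    raw_values: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first run of whitespace only.
        uri, value = line.split(maxsplit=1)
        uris.append(uri)
        raw_values.append(value)

    values: List[Value]
    try:
        values = [float(v) for v in raw_values]
    except ValueError:
        # At least one value is not numeric: keep the column as text.
        values = list(raw_values)
    return dict(zip(uris, values))


durations = load_map(
    io.StringIO("filename1 60.0\nfilename2 123.450\nfilename3 32.400\n")
)
domains = load_map(
    io.StringIO("filename1 radio\nfilename2 radio\nfilename3 phone\n")
)
```

Deciding the type per column rather than per value avoids ending up with a mix of floats and strings in the same file.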

> the subset_trial_iter function that will create the trial function needed in SpkVerif protocols

Yes. Will you always assume that try_with contains the whole file?
Or do you have any idea how we could support trials with file excerpts? It is OK if your answer to the second question is "no": I'll live with that :-)
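To make the "whole file" case concrete, here is a minimal sketch of how a trial could set try_with from a duration lookup. Everything except the try_with key is an assumption: the function name `whole_file_trial`, the other key names, and the use of a plain `(start, end)` tuple in place of whatever segment type the package actually uses.

```python
from typing import Dict, Tuple

Excerpt = Tuple[float, float]  # (start, end) in seconds; stand-in type


def whole_file_trial(
    label: int, uri1: str, uri2: str, durations: Dict[str, float]
) -> dict:
    """Build one trial whose try_with spans each file entirely.

    `durations` maps a file uri to its duration in seconds.
    """

    def describe(uri: str) -> dict:
        whole_file: Excerpt = (0.0, durations[uri])
        return {"uri": uri, "try_with": whole_file}

    return {
        "reference": label,
        "file1": describe(uri1),
        "file2": describe(uri2),
    }


trial = whole_file_trial(
    1, "filename1", "filename2", {"filename1": 60.0, "filename2": 123.45}
)
```

In this sketch, supporting file excerpts would only change the `(start, end)` pair stored under try_with; where those boundaries would come from in the trial file is left open, as in the question above.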

> and that I should modify: add_custom_protocols and create_protocol

Yes!

And an update to the README for completeness ;-)

Thanks again. Looking forward to it!

pyannote / pyannote-database

feat: add support for custom SpeakerVerification protocols #50