Closed vbrignatz closed 4 years ago
Thanks @vbrignatz - this is a very nice addition to the package 🎉
I have been working in parallel on refactoring pyannote.database.custom (and the introduction of custom data loaders) to make it much more flexible. Work is in progress in branch custom (or pull request #51).
This will undoubtedly conflict with the changes you propose in this pull request. Therefore, I will come back to this pull request when #51 has been merged.
Feel free to give your opinion on #51 as well (in particular, how it could be improved to make custom speaker identification protocols easier to define).
Hi @vbrignatz, would you mind updating your PR to work on the custom branch?
I have added a bunch of things (and proper documentation) that should make the integration of new custom tasks (such as speaker verification) easier.
If you do not want to do it or do not have the time, please let me know and I will merge the custom branch as it is. Otherwise, it can wait for your updated PR.
Hi @hbredin, I will work on this tomorrow. FYI, my thinking is that I should create:
- the TRIALLoader class to load the trials file
- the DURLoader class to load the durations files
- the subset_trial_iter function that will create the trial function needed in SpkVerif protocols
and that I should modify:
- add_custom_protocols
- create_protocol
to support the custom SpkVerif protocols.
Hi @hbredin, I will work on this tomorrow.
Great. Thanks!
the DURLoader class to load the durations files
It could be something slightly more generic that expects a text file (e.g. with .map suffix) with the following "uri value" format:
filename1 value1
filename2 value2
filename3 value3
I can already think of two use cases for this kind of data loader:
file duration:
filename1 60.0
filename2 123.450
filename3 32.400
audio domain:
filename1 radio
filename2 radio
filename3 phone
The only issue I foresee is how to make sure that durations are returned as float and domains as str. I think pandas.read_csv (pandas is already a requirement for pyannote.database anyway) is smart enough to do the conversion itself but maybe there is another way...
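To illustrate the type inference mentioned above, here is a minimal sketch of such a generic loader. The function name load_map and the column names are hypothetical (they are not part of pyannote.database); only pandas.read_csv itself is assumed. With whitespace-separated columns and no header, pandas infers a float column for durations and keeps domains as strings:

```python
import io

import pandas as pd


def load_map(source):
    """Load a two-column "uri value" text file into a {uri: value} dict.

    `load_map` is a hypothetical helper used for illustration only.
    `source` can be a path or any file-like object accepted by pandas.
    """
    table = pd.read_csv(
        source,
        sep=r"\s+",              # columns are separated by whitespace
        header=None,             # the file has no header row
        names=["uri", "value"],  # hypothetical column names
        dtype={"uri": str},      # always keep file identifiers as strings
    )
    # The dtype of the "value" column is left to pandas' own inference.
    return dict(zip(table["uri"], table["value"]))


# In-memory stand-ins for the two example .map files
durations = load_map(io.StringIO(
    "filename1 60.0\nfilename2 123.450\nfilename3 32.400\n"
))
domains = load_map(io.StringIO(
    "filename1 radio\nfilename2 radio\nfilename3 phone\n"
))

print(type(durations["filename1"]), type(domains["filename1"]))
```

One caveat with relying on inference: a column of labels that all look numeric (e.g. channel ids such as 1 and 2) would be loaded as numbers, so a loader may still want an optional way to force the dtype.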
the subset_trial_iter function that will create the trial function needed in SpkVerif protocols
Yes. Will you always assume that try_with contains the whole file?
Or do you have any idea how we could support trials with file excerpts?
It is OK if your answer to the second question is "no": I'll live with that :-)
and that I should modify: add_custom_protocols and create_protocol
Yes!
And an update to the README for completeness ;-)
Thanks again. Looking forward to it!
I modified custom.py so that it supports the creation of speaker verification protocols on the fly.
I added two keys in the configuration file database.yml:
I tested the validation and the training of a speaker embedding model with VoxCeleb2 as the custom dataset, which I named myVoxCeleb, and it worked.