Jungjee / RawNet

Official repository for RawNet, RawNet2, and RawNet3
MIT License

how to create the test_list for a new test dataset #33

Closed wwyl2000 closed 5 months ago

wwyl2000 commented 6 months ago

Hi Jungjee,

Thanks for sharing your great work! Could you please share the code you used to create the test_list for a new test dataset? For example, if I want to test on the TIMIT corpus.

head -5 vox1_veri_test2.txt

1 id10270/x6uYqmx31kE/00001.wav id10270/8jEAjG6SegY/00008.wav
0 id10270/x6uYqmx31kE/00001.wav id10300/ize_eiCFEg0/00003.wav
1 id10270/x6uYqmx31kE/00001.wav id10270/GWXujl-xAVM/00017.wav
0 id10270/x6uYqmx31kE/00001.wav id10273/0OCW1HUxZyg/00001.wav
1 id10270/x6uYqmx31kE/00001.wav id10270/8jEAjG6SegY/00022.wav

Here is what I understand: column 1 is 1 if column 2 and column 3 are from the same speaker, and 0 otherwise. For a large corpus like VoxCeleb, how should the utterances in column 2 and column 3 be selected?

Your help will be greatly appreciated!

Regards, Willy

Jungjee commented 5 months ago

Hi Willy,

Thanks for reaching out. To clarify, the test_list has been made by the authors of the VoxCeleb paper, not me.

I could still add a few notes. There would be several strategies to compose an evaluation protocol (i.e., a set of trials).

The most naive methodology would be to perform a random selection. However, to perform a more thorough assessment of different models, you could make the evaluation protocol composition more sophisticated.

A few practices would include considering the duration of each utterance, the speakers' gender (excluding cross-gender trials is a typical setup, as they would be too easy nowadays), their age, or even what they're saying (assuming that you have the ground-truth transcription).
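As a rough illustration of the random-selection strategy with the cross-gender exclusion mentioned above, here is a minimal Python sketch. It is not the procedure used for the VoxCeleb lists. The function name `random_trials`, the `utts` mapping (speaker ID to a non-empty list of utterance paths), and the `gender` mapping (speaker ID to a gender label) are all hypothetical and assumed to be prepared by you beforehand.

```python
import random
from collections import defaultdict


def random_trials(utts, gender, n_target, n_nontarget, seed=0):
    """Randomly draw same-gender trials as (label, enroll, test) tuples.

    utts:   {speaker_id: [utterance paths]}  (hypothetical input layout)
    gender: {speaker_id: gender label}       (hypothetical input layout)
    """
    rng = random.Random(seed)

    # Group speakers by gender so non-target pairs never cross genders.
    by_gender = defaultdict(list)
    for spk in sorted(utts):
        by_gender[gender[spk]].append(spk)

    # Target trials need a speaker with at least two utterances;
    # non-target trials need at least one other speaker of the same gender.
    can_target = [s for s in sorted(utts) if len(utts[s]) >= 2]
    can_nontarget = [s for s in sorted(utts) if len(by_gender[gender[s]]) >= 2]

    def draw_target():
        spk = rng.choice(can_target)
        enroll, test = rng.sample(utts[spk], 2)  # two distinct utterances
        return (1, enroll, test)

    def draw_nontarget():
        spk_a = rng.choice(can_nontarget)
        spk_b = rng.choice([s for s in by_gender[gender[spk_a]] if s != spk_a])
        return (0, rng.choice(utts[spk_a]), rng.choice(utts[spk_b]))

    def sample_unique(draw, n):
        # Rejection sampling with an attempt cap, so the loop still ends
        # when fewer than n distinct trials exist.
        seen = set()
        attempts = 0
        while len(seen) < n and attempts < 100 * n:
            seen.add(draw())
            attempts += 1
        return sorted(seen)

    return sample_unique(draw_target, n_target) + sample_unique(draw_nontarget, n_nontarget)
```

Each tuple can then be written out in the same three-column layout as vox1_veri_test2.txt, for example with `f"{label} {enroll} {test}"`. Restrictions on duration or age could be added in the same way, by filtering the candidate utterances before drawing.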

wwyl2000 commented 5 months ago

Hi Jungjee,

Thank you very much for your reply. The suggestions are helpful. I created a test list for my dataset, using random selection with some restrictions. For same-speaker tests, each speaker's enroll utterance in N is matched against his/her non-enroll utterances, while for different-speaker tests, each speaker's utterance is matched against a randomly selected utterance of any other speaker. I may consider other factors as you suggested.

Thanks, Willy
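For readers who want to try a scheme like the one described above, the following is a minimal sketch, not the actual script used in this thread. It assumes a hypothetical `utts` mapping from speaker ID to a non-empty list of utterance paths, and it assumes that the first utterance in each list serves as the enrollment utterance. The function name `enroll_based_trials` is likewise made up for illustration.

```python
import random


def enroll_based_trials(utts, seed=0):
    """Build trial lines in the 'label enroll test' format.

    utts: {speaker_id: [utterance paths]}  (hypothetical input layout)
    Assumption: the first path in each list is the enrollment utterance.
    """
    rng = random.Random(seed)
    speakers = sorted(utts)
    if len(speakers) < 2:
        raise ValueError("need at least two speakers for different-speaker trials")

    lines = []
    for spk in speakers:
        enroll, *rest = utts[spk]
        # Same-speaker trials: enrollment utterance vs. every other
        # utterance of the same speaker.
        for test in rest:
            lines.append(f"1 {enroll} {test}")
        # Different-speaker trials: every utterance of this speaker vs. one
        # random utterance of a randomly chosen other speaker.
        others = [s for s in speakers if s != spk]
        for utt in utts[spk]:
            other = rng.choice(others)
            lines.append(f"0 {utt} {rng.choice(utts[other])}")
    return lines
```

Writing the result with `"\n".join(lines)` gives a file in the same layout as vox1_veri_test2.txt. Note that this produces one different-speaker trial per utterance, so the ratio of same-speaker to different-speaker trials depends on how many utterances each speaker has.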

wwyl2000 commented 5 months ago

Completed!