Closed hktxt closed 5 years ago
Hey!
I am really glad that you found my code to be helpful.
Regarding your question, I think you are correct: one path per row. iden_split.txt is the file that VGG provides with the dataset: on the VoxCeleb1 page, look for "Dataset split for Identification". You may also take a look at the preprocessing.ipynb notebook, which processes the raw downloaded files.
It appears to me that each row of this file consists of a phase (train: 1 and 2; test: 3) and an audio path (of the format id/track/segment). To verify that the first column is a phase, you may count the number of rows per phase and compare the counts with the values given in Table 5 of the paper. It is also a reasonable assumption, since identification is treated as a simple classification task: the output of the last layer is fed into a 1,251-way softmax in order to produce a distribution over the 1,251 different speakers.
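In case it helps, here is a rough sketch of the row-counting check. It assumes each row looks like "phase path" separated by whitespace, which is only my reading of the file; the sample rows and the helper names (parse_split_line, count_phases, speaker_of) are made up for illustration and are not taken from the repository.

```python
from collections import Counter


def parse_split_line(line):
    """Split one row into (phase, path).

    Assumes the row is '<phase> <path>' separated by whitespace.
    """
    phase, path = line.split(maxsplit=1)
    return int(phase), path.strip()


def count_phases(lines):
    """Count how many rows belong to each phase value."""
    counts = Counter()
    for line in lines:
        if line.strip():  # skip blank rows
            phase, _ = parse_split_line(line)
            counts[phase] += 1
    return counts


def speaker_of(path):
    """Return the speaker id, assuming the path format id/track/segment."""
    return path.split("/")[0]


# Hypothetical rows for illustration only, not real dataset entries.
sample_rows = [
    "1 idA/trackX/00001.wav",
    "1 idA/trackX/00002.wav",
    "2 idA/trackY/00001.wav",
    "3 idB/trackZ/00001.wav",
]

phase_counts = count_phases(sample_rows)
speakers = {speaker_of(parse_split_line(row)[1]) for row in sample_rows}
print(dict(phase_counts), sorted(speakers))
```

On the real file you could pass the open file object instead of sample_rows, e.g. count_phases(open("iden_split.txt")), and then compare the per-phase counts with Table 5. The number of distinct speaker ids should likewise match the 1,251 classes if the assumed path format holds.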
@vdyashin
I just realized that it was provided by VGG. Thanks anyway~
I'll ask for more help if I get stuck~~~hahaha
Hi~ your code is really helpful. Could you please tell me more about iden_split.txt? Is it a text file that contains file paths, one path per row?