mycrazycracy / tf-kaldi-speaker

Neural speaker recognition/verification system based on Kaldi and Tensorflow
Apache License 2.0

Inference script #6

Closed shatealaboxiaowang closed 4 years ago

shatealaboxiaowang commented 4 years ago

Hi Dr. Liu, thank you very much for sharing this. I saw that your EER result (EER = 0.02) is state of the art, but I have a few questions. (1) I can't find the prediction code; I just want to try inference. (2) How many days did training on the VoxCeleb dataset take? Looking forward to your reply. Thank you!

mycrazycracy commented 4 years ago

Hi, (1) I must point out that EER = 2% was achieved when both the VoxCeleb2 and VoxCeleb1 dev sets were used to train the model. In the official protocol, only the VoxCeleb2 dev set can be used, so I currently don't know what the performance would be with only the VoxCeleb2 dev set. (2) The inference code is included in the script. Take a look at egs/voxceleb/v1/run.sh and search for nnet/run_extract_embeddings.sh, which extracts the speaker embeddings. (3) I used a single P100 GPU, and training took about 2.5 days (178031 s exactly). Some methods could be used to speed up training, but I didn't apply them.
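For anyone budgeting GPU time from the figure above, here is a minimal sketch that converts the quoted 178031 s into days and hours. The names `TRAINING_SECONDS` and `describe_duration` are illustrative only and are not part of the repository. The exact figure works out to a little over two days, somewhat less than the rounded "about 2.5 days".

```python
from datetime import timedelta

# Wall-clock training time quoted in the reply above (single P100 GPU).
TRAINING_SECONDS = 178031


def describe_duration(seconds):
    """Return (days, hours, 'D days, H:MM:SS') for a duration in seconds."""
    days = seconds / 86400   # 86400 seconds per day
    hours = seconds / 3600   # 3600 seconds per hour
    return days, hours, str(timedelta(seconds=seconds))


days, hours, pretty = describe_duration(TRAINING_SECONDS)
print(f"{days:.2f} days = {hours:.1f} hours ({pretty})")
# prints: 2.06 days = 49.5 hours (2 days, 1:27:11)
```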

If you have any other questions, please let me know.

shatealaboxiaowang commented 4 years ago

Thank you very much. I will try it and get back to you if I run into problems.