WeidiXie / VGG-Speaker-Recognition

Utterance-level Aggregation For Speaker Recognition In The Wild
362 stars, 98 forks

There are some problems with voxlb2_train.txt and voxlb2_val.txt #48

Closed lyu-joe closed 4 years ago

lyu-joe commented 4 years ago

Some identical audio clips appear in both voxlb2_train.txt and voxlb2_val.txt 296870716

WeidiXie commented 4 years ago

Oh, this validation set is useless. Speaker verification is only evaluated in the open-set setting, meaning that no audio clip from a test speaker should appear in training.

So you should check the overlap between voxceleb2 and voxceleb1, because the model trained with voxceleb2 will be tested on voxceleb1.
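An overlap check of this kind could be sketched as follows. This is only an illustration: it assumes each list entry is a line whose first whitespace-separated field is an audio path of the form `<speaker_id>/<video_id>/<clip>.wav`, which the thread does not confirm, and the helper names are hypothetical.

```python
def read_paths(lines):
    """Collect the audio path (assumed to be the first field) of each non-empty line."""
    return {line.split()[0] for line in lines if line.strip()}


def speaker_of(path):
    """Return the speaker ID, assuming it is the first path component."""
    return path.split("/")[0]


def find_clip_overlap(lines_a, lines_b):
    """Audio paths that appear in both lists."""
    return sorted(read_paths(lines_a) & read_paths(lines_b))


def find_speaker_overlap(lines_a, lines_b):
    """Speaker IDs that appear in both lists."""
    speakers_a = {speaker_of(p) for p in read_paths(lines_a)}
    speakers_b = {speaker_of(p) for p in read_paths(lines_b)}
    return sorted(speakers_a & speakers_b)


# In practice the lines would come from files, e.g. open("voxlb2_train.txt").
# The entries below are made up for demonstration.
train_lines = ["id00012/abc/00001.wav 0", "id00015/def/00002.wav 1"]
test_lines = ["id00015/def/00002.wav 1", "id10270/ghi/00003.wav 2"]

print(find_clip_overlap(train_lines, test_lines))
print(find_speaker_overlap(train_lines, test_lines))
```

For the open-set setting described above, both results should be empty when the first list is the training set and the second is the test set.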

lyu-joe commented 4 years ago

I still have some questions. You train this speaker verifier using the training procedure of a classifier, and the quantity you monitor during training is the training loss. So you did not use a validation set until the end, right?

WeidiXie commented 4 years ago

While designing the architecture and searching over hyper-parameters, number of training epochs, etc., you need to split the training set into train and val subsets and make them disjoint.

Once you have decided on the best parameters, train on all the data with a cyclic learning rate.
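The exact cyclic schedule is not given in this thread. As one common form, a triangular cyclic learning rate can be sketched like this; the function name and the values of `base_lr`, `max_lr`, and `half_cycle` are placeholders chosen for illustration, not the settings used in the repository.

```python
def triangular_lr(step, base_lr=1e-4, max_lr=1e-3, half_cycle=1000):
    """Triangular cyclic schedule.

    The rate rises linearly from base_lr to max_lr over half_cycle steps,
    falls back to base_lr over the next half_cycle steps, then repeats.
    """
    position = step % (2 * half_cycle)
    if position < half_cycle:
        fraction = position / half_cycle        # rising half of the cycle
    else:
        fraction = 2 - position / half_cycle    # falling half of the cycle
    return base_lr + (max_lr - base_lr) * fraction


# Print the rate at a few points of the first cycle.
for step in (0, 500, 1000, 1500, 2000):
    print(step, triangular_lr(step))
```

A function like this would typically be called once per training step (or per epoch, with `step` counting epochs) to set the optimizer's learning rate.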

lyu-joe commented 4 years ago

When I split the training set into disjoint train and val subsets, I follow the usual approach for training a classifier: train and val share all the same speakers, but the clips come from different recordings of the same person.

WeidiXie commented 4 years ago

yes.
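A split of this kind (same speakers in both subsets, no clip shared between them) might look like the sketch below. It assumes the data is available as `(audio_path, speaker_label)` pairs and that each path belongs to one speaker; the function name, `val_fraction`, and `seed` are hypothetical and not taken from the repository.

```python
import random
from collections import defaultdict


def split_by_utterance(entries, val_fraction=0.1, seed=0):
    """Split (audio_path, speaker_label) pairs into train and val lists.

    Every speaker with at least two clips contributes to both subsets,
    and no clip is placed in both.
    """
    clips_by_speaker = defaultdict(set)
    for path, label in entries:
        clips_by_speaker[label].add(path)

    rng = random.Random(seed)
    train, val = [], []
    for label in sorted(clips_by_speaker):
        paths = sorted(clips_by_speaker[label])
        rng.shuffle(paths)
        if len(paths) > 1:
            # Hold out at least one clip, but always keep one for training.
            n_val = min(max(1, int(len(paths) * val_fraction)), len(paths) - 1)
        else:
            n_val = 0
        val.extend((p, label) for p in paths[:n_val])
        train.extend((p, label) for p in paths[n_val:])
    return train, val


# Made-up entries: 3 speakers with 10 clips each.
entries = [(f"id{s}/video/{c:05d}.wav", s) for s in range(3) for c in range(10)]
train, val = split_by_utterance(entries)
print(len(train), len(val))
```

Shuffling with a fixed seed keeps the split reproducible across runs, which matters when comparing hyper-parameter settings on the same val subset.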

lyu-joe commented 4 years ago

Thank you for your prompt and patient response. Finally, what was your final training loss? Thanks again.

WeidiXie commented 4 years ago

Can't remember this, but I think the final accuracy you will get on training set is around 92-93%.

THtanghuan commented 4 years ago

Hey, when I load your pre-trained model it raises "ValueError: You are trying to load a weight file containing 80 layers into a model with 81 layers". I did not change the structure of your model, so why does this happen?

WeidiXie commented 4 years ago

Change the way the model is loaded: https://github.com/WeidiXie/VGG-Speaker-Recognition/blob/837735b72033f4d2369fd3ba0d0f5d632b14fc07/src/main.py#L81

If you are using a single GPU: network.load_weights(os.path.join(args.resume), by_name=True)

THtanghuan commented 4 years ago

Which layer is the extra layer?

WeidiXie commented 4 years ago

If you are doing verification, then the last classifier layer is redundant.

Fan0fan commented 2 years ago

Can you send me the voxlb2_val.txt? Maybe the author deleted this file. Looking forward to your reply.