Snowdar / asv-subtools

An open-source toolkit for speaker recognition
Apache License 2.0
597 stars · 135 forks

train the standard xvector model on VoxCeleb1 trainset #7

Closed matln closed 3 years ago

matln commented 4 years ago

I tried to train the standard x-vector model on the VoxCeleb1 training set using the script runVoxceleb.sh with 4 GPUs. I used the default parameters in runStandardXvector-voxceleb1.py entirely, except for changing the weight decay to 5e-1 (I also tried 3e-1), but the resulting EER is only 3.531% for the 21-epoch far embedding with the PLDA backend. I was unable to reach the 3.028% reported at the bottom of runStandardXvector-voxceleb1.py. Is there something I overlooked, or something I need to modify?

Snowdar commented 4 years ago

Hi, in fact the reported result is based on a single GPU with augdropout, while runStandardXvector-voxceleb1.py is a baseline configuration with Softmax loss. For a small dataset like VoxCeleb1, regularization (weight decay and dropout) is very useful for improving the generalization of the model. Just forget the historical results in runStandardXvector-voxceleb1.py; a much better result of 2.6% is given at https://github.com/Snowdar/asv-subtools#1-voxceleb-recipe-speaker-recognition. Note that there seems to be a bad problem/bug in my experiments when combining specaugment with multi-GPU training, so I suggest not using multi-GPU for specaugment training for the time being.
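The effect of weight decay mentioned above can be seen in a toy update rule. This is a minimal sketch in plain Python (an illustration of the general L2-penalty idea, not the repo's actual training code; `sgd_step` is a hypothetical helper):

```python
# Weight decay adds an L2 penalty to each SGD update, shrinking the
# weights toward zero in addition to following the gradient. This is
# why it acts as a regularizer on small datasets like VoxCeleb1.
def sgd_step(w, grad, lr=0.1, weight_decay=0.01):
    # effective gradient = true gradient + weight_decay * w
    return w - lr * (grad + weight_decay * w)

w = 1.0
# With a zero gradient, weight decay alone shrinks the weight each step:
# each update multiplies w by (1 - lr * weight_decay) = 0.999.
for _ in range(3):
    w = sgd_step(w, grad=0.0)
# w == 0.999 ** 3 ≈ 0.997
```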

matln commented 4 years ago

Thanks for your advice! I will use specaugment for further experiments.

jjjjohnson commented 3 years ago

Hi @Snowdar, I cannot understand the relationship between specaugment and multi-GPU training. To my knowledge, specaugment only happens in `__getitem__` of the `ChunkEgs` class, which has nothing to do with the multi-GPU sampler...

Thanks for any help! Junjie

jjjjohnson commented 3 years ago

I guess the problem is utils.set_all_seed(1024), since DDP spawns its processes independently. If you set the same random seed in each process, all the randomness in specaugment is identical across processes.
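This diagnosis can be illustrated with a small sketch (plain Python; `spec_masks` is a hypothetical stand-in for the random draws specaugment makes, not the repo's actual code). When every DDP rank seeds its RNG with the same value, the masks coincide across processes; offsetting the seed by rank is one common fix that restores independent augmentation per GPU:

```python
import random

# Stand-in for the random mask parameters specaugment would draw
# inside __getitem__ (widths/positions of time-frequency masks).
def spec_masks(seed, num_masks=4, max_width=10):
    rng = random.Random(seed)
    return [rng.randint(0, max_width) for _ in range(num_masks)]

# Same seed on every rank -> every GPU applies identical augmentation,
# so multi-GPU training duplicates masks instead of drawing new ones.
same = [spec_masks(1024) for rank in range(4)]
assert all(m == same[0] for m in same)

# One common fix (an assumption, not necessarily what the repo later
# adopted): offset the seed by the process rank so each worker draws
# different masks while the run stays reproducible overall.
per_rank = [spec_masks(1024 + rank) for rank in range(4)]
assert len({tuple(m) for m in per_rank}) > 1
```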

Snowdar commented 3 years ago

Yeah, it is exactly this seed problem.
