llearner closed 3 years ago
Sorry, I found a problem in my config...
I got an EER of 1.1188%, versus the 0.88% reported in "The ins and outs of speaker recognition: lessons from VoxSRC 2020". Has anyone reproduced it and gotten a different result? Here is my config:
model: ResNetSE34V2
n_mels: 64
log_input: True
trainfunc: softmaxproto
batch_size: 450
nPerSpeaker: 2
augment: True
lr: 0.001
lr_decay: 0.75
weight_decay: 5e-5
test_interval: 16
encoder_type: ASP
max_epoch: 256
max_seg_per_spk: 500
eval_frames: 400
margin: 0.2
scale: 30
nOut: 512
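For readers comparing the two EER figures above, here is a minimal sketch of how an equal error rate can be approximated from trial scores, plus the relative gap between 1.1188% and 0.88%. It is not the repository's evaluation code: the function name `equal_error_rate` and the toy scores are hypothetical, and the threshold sweep is a simple approximation chosen for readability.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by trying every observed score as a threshold.

    labels: 1 for same-speaker (target) trials, 0 for different-speaker trials.
    A trial is accepted when its score is >= the threshold.
    """
    targets = [s for s, y in zip(scores, labels) if y == 1]
    nontargets = [s for s, y in zip(scores, labels) if y == 0]
    if not targets or not nontargets:
        raise ValueError("need at least one target and one non-target trial")

    best_gap, eer = None, None
    for t in sorted(set(scores)):
        frr = sum(s < t for s in targets) / len(targets)         # false rejects
        far = sum(s >= t for s in nontargets) / len(nontargets)  # false accepts
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy data (made up for illustration only).
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.5, 0.3, 0.2]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_eer = equal_error_rate(scores, labels)

# Relative gap between the EER reported in this thread and the paper's figure.
reported_eer, paper_eer = 1.1188, 0.88
relative_gap = (reported_eer - paper_eer) / paper_eer
```

On these numbers the reproduced EER is roughly 27% higher, relative, than the paper's figure.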
@llearner
Sorry, I haven't tried it yet, but I will later.
In the meantime, I'd guess it is worth trying ResNetSE34Half instead of ResNetSE34V2.
See #89
@joonson Mr. joonson, thank you for your contributions to speaker recognition!
Following "The ins and outs of speaker recognition: lessons from VoxSRC 2020", we tried to reproduce HA4 H/ASP AP+softmax on a single Tesla V100 GPU with mixedprec set to True, but our EER is much higher than 0.88%. Could you please share your configuration? Thanks a lot!
This is our config:
model: ResNetSE34Half
n_mels: 64
log_input: True
trainfunc: softmaxproto
batch_size: 400
nPerSpeaker: 2
augment: True
lr: 0.001
lr_decay: 0.75
weight_decay: 5e-5
test_interval: 16
encoder_type: ASP
max_epoch: 256
max_seg_per_spk: 100