Walleclipse / Deep_Speaker-speaker_recognition_system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)

Hard negative mining #47

Closed tuanphan09 closed 3 years ago

tuanphan09 commented 4 years ago

Based on your report, training with random batch was better than using hard negative mining, right?

Walleclipse commented 4 years ago

Actually, no. The better strategy is: in the earlier stage of learning (0\~80k steps), use softmax pretraining or random-batch learning, then (80k\~110k steps) use hard negative mining. In the report, the EER for the random-batch strategy is 8%, while the EER for hard negative mining is 6%.
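For anyone implementing the hard-negative-mining stage, the core idea is to pick, for each anchor, the negative example whose embedding is most similar to the anchor. A minimal sketch (function names and the cosine-similarity criterion are my own illustration, not code from this repo):

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def hardest_negative(anchor, negatives):
    """Index of the negative embedding most similar to the anchor,
    i.e. the hardest negative for the triplet loss."""
    return max(range(len(negatives)), key=lambda i: cosine(anchor, negatives[i]))
```

During the 80k\~110k-step stage, triplets would be built from these hardest negatives instead of random ones.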

tuanphan09 commented 4 years ago

Thank you.

Could you tell me how you got the 17k pretrained model in the checkpoint directory (with or without hard negative mining, and for how many steps)? I'm trying to compare my model with yours, and I'm working on this task with a GAN, as in this paper.

Walleclipse commented 4 years ago

You can get the pretrained model with the following steps:

  1. In constant.py, set PRE_TRAIN = False.
  2. Run pretraining.py. I ran approximately 80,000 steps of softmax pretraining.
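Softmax pretraining, as in step 2 above, trains the network as a speaker classifier with a cross-entropy loss before switching to triplet training. A minimal sketch of that loss (pure Python, names are my own and not taken from pretraining.py):

```python
from math import exp, log

def softmax(logits):
    """Convert per-speaker logits into a probability distribution."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_speaker):
    """Softmax pretraining loss: negative log-probability of the true speaker."""
    return -log(softmax(logits)[true_speaker])
```

After pretraining, the classification head is discarded and the embedding layer is kept for triplet-based fine-tuning.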

I am not familiar with GANs, and I look forward to seeing your results with them.