philipperemy / deep-speaker

Deep Speaker: an End-to-End Neural Speaker Embedding System.
MIT License

transfer learning with pretrained models: is pre-training needed with new data? #86

Closed · zabir-nabil closed this 3 years ago

zabir-nabil commented 3 years ago

Hi @philipperemy , thanks for sharing this awesome repository.

I'm training from your pretrained triplet-loss model on my own data. Should I run the softmax pre-training again, or can I just continue training in the triplet setup with a smaller learning rate?

zabir-nabil commented 3 years ago

I'm training without pre-training right now, using Adam (lr = 0.0001); it may take a few days to train.

philipperemy commented 3 years ago

@zabir-nabil triplet training alone should probably be enough! The softmax training is just pre-training. If it takes too long, you can try increasing the LR a bit, but if you see your loss going up and down, revert to the default one!
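
In practice, that fine-tuning setup can look roughly like the sketch below. It assumes the Keras `DeepSpeakerModel` wrapper and the pretrained triplet checkpoint name from this repo's README; the `deep_speaker_loss` import path and the batch generator are assumptions, not the repo's exact train script.

```python
# Rough fine-tuning sketch (not the repo's train script).
from tensorflow.keras.optimizers import Adam

from conv_models import DeepSpeakerModel      # ResCNN wrapper from this repo
from triplet_loss import deep_speaker_loss    # triplet loss (assumed import path)

model = DeepSpeakerModel()
# Pretrained triplet checkpoint, named as in the README usage example.
model.m.load_weights('ResCNN_triplet_training_checkpoint_265.h5', by_name=True)

# Continue triplet training on the new data with a smaller learning rate.
model.m.compile(optimizer=Adam(learning_rate=0.0001), loss=deep_speaker_loss)

# `triplet_batch_generator` is a placeholder for your own generator yielding
# (anchor, positive, negative) batches in the layout the loss expects.
# model.m.fit(triplet_batch_generator, steps_per_epoch=..., epochs=...)
```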

zabir-nabil commented 3 years ago

Hi @philipperemy, I trained on multiple datasets, but I got even worse performance than your pretrained weights. After training on dataset 1, I only got this:

EER 1: 0.2744138634046891
EER 2: 0.27443267776096825
EER: 0.27442327058282867
EER threshold: 0.9836212992668152
Acc: 0.725700365408039

The interesting part is that the EER threshold is so high.
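
(For reference: the EER and its threshold can be computed from pairwise cosine-similarity scores roughly as in the sketch below, which uses scikit-learn and is not necessarily the evaluation code used here. When the balancing threshold sits almost at 1.0, it usually means the similarity scores for same-speaker and different-speaker pairs are both bunched near 1.0.)

```python
# Generic EER sketch (not this repo's evaluation code): the EER is the point
# on the ROC curve where false-acceptance and false-rejection rates are equal.
import numpy as np
from sklearn.metrics import roc_curve

def compute_eer(labels, scores):
    # labels: 1 for same-speaker pairs, 0 for different-speaker pairs.
    # scores: cosine similarities (higher = more likely same speaker).
    fpr, tpr, thresholds = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))
    eer = (fpr[idx] + fnr[idx]) / 2.0
    return eer, thresholds[idx]
```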

After training on dataset 2 + dataset 1, I got even poorer results:

EER 1: 0.34413863404689093
EER 2: 0.34432677760968233
EER: 0.34423270582828663
EER threshold: 0.9998130798339844
Acc: 0.6560292326431182

I trained on VoxCeleb1 + https://www.openslr.org/38/

Do you have any suggestions?