TaoRuijie / ECAPA-TDNN

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER = 0.86 on Vox1_O when trained only on Vox2)
MIT License

training set is not 5 times bigger after augmentation #14

Closed: youyou098888 closed this issue 2 years ago

youyou098888 commented 2 years ago

I notice that in the dataloader, the training set after augmentation has the same number of samples as the original audio set.

So augmentation is not meant to increase the amount of training data, only its diversity?

TaoRuijie commented 2 years ago

Er, I think those end up being the same thing.

If you do not use augmentation, you may finish training and reach the best result in about 10 epochs, since the model learns such easy, common data very quickly.

In this project you may need 50-60 epochs to reach the best result, because the model has to learn many different kinds of data.

So the effective amount of training data does increase (although within each epoch it is unchanged), and so does its diversity.
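To make the point concrete, here is a minimal sketch of on-the-fly augmentation in a PyTorch-style Dataset. This is not the repository's actual dataLoader.py; the class and variable names are illustrative. The idea is that __len__ stays equal to the original training set, while __getitem__ draws a fresh random augmentation each time an item is fetched, so the same utterance looks different in every epoch.

```python
# Minimal sketch (illustrative, not the repo's dataLoader.py):
# fixed dataset length, random additive-noise augmentation per fetch.
import random
import numpy as np
from torch.utils.data import Dataset

class AugmentedTrainSet(Dataset):
    def __init__(self, utterances, noise_clips):
        self.utterances = utterances    # list of clean waveforms (np.float32 arrays)
        self.noise_clips = noise_clips  # list of noise waveforms for additive mixing

    def __len__(self):
        # Same length as the original training set: augmentation adds no items.
        return len(self.utterances)

    def __getitem__(self, idx):
        audio = self.utterances[idx]
        # Randomly decide whether (and how) to augment each time the item
        # is fetched, so every epoch sees a different version of the sample.
        if random.random() < 0.5:
            noise = random.choice(self.noise_clips)
            noise = np.resize(noise, audio.shape)           # tile/crop to match length
            snr_db = random.uniform(5, 20)                  # random target SNR
            noise_gain = np.sqrt(np.mean(audio ** 2) /
                                 (np.mean(noise ** 2) * 10 ** (snr_db / 10) + 1e-8))
            audio = audio + noise_gain * noise              # additive noise at that SNR
        return audio.astype(np.float32), idx
```

Over 50-60 epochs the model therefore sees many different augmented versions of every utterance, even though each epoch iterates over the same number of items.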

youyou098888 commented 2 years ago

Thanks, that is very clear.