HarryVolek / PyTorch_Speaker_Verification

PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
BSD 3-Clause "New" or "Revised" License
575 stars, 165 forks

Shuffling wav files in dataloader does not ensure that all the training files are checked in each epoch #73

Open dkatsiros opened 3 years ago

dkatsiros commented 3 years ago

As a result, the model is trained on N*M utterances per epoch rather than on the whole training set. This affects convergence as well as possible extensions of the code (e.g. early stopping).

Here N is the number of speakers per batch and M is the number of utterances per speaker per batch, following the referenced paper. https://github.com/HarryVolek/PyTorch_Speaker_Verification/blob/10e159a8d3255503c0184cde4eb7097968857a31/data_load.py#L39-L40
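To make the coverage concern concrete, here is a rough simulation. It is only a sketch and not the repository's code: the function `coverage_after_epoch`, the corpus sizes, and the sampling scheme (N distinct random speakers per batch, M distinct random utterances per speaker) are all assumptions chosen for illustration.

```python
import random

def coverage_after_epoch(utts_per_speaker, n, m, steps, seed=0):
    """Fraction of distinct utterances touched after `steps` batches.

    Hypothetical sampling scheme: each batch draws n distinct random
    speakers and, for each of them, m distinct random utterances.
    """
    rng = random.Random(seed)
    speakers = sorted(utts_per_speaker)
    seen = set()
    for _ in range(steps):
        for spk in rng.sample(speakers, n):
            for utt in rng.sample(range(utts_per_speaker[spk]), m):
                seen.add((spk, utt))
    return len(seen) / sum(utts_per_speaker.values())

# Hypothetical corpus: 20 speakers with 60 utterances each (1200 in total).
corpus = {f"spk{i}": 60 for i in range(20)}
one_batch = coverage_after_epoch(corpus, n=4, m=5, steps=1)
five_batches = coverage_after_epoch(corpus, n=4, m=5, steps=5)
print(f"1 batch: {one_batch:.3f}, 5 batches: {five_batches:.3f}")
```

With random draws like these, a single batch can touch at most N*M utterances, and repeated batches may re-draw utterances that were already seen, so nothing guarantees that an "epoch" visits every file.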

dkatsiros commented 3 years ago

For the TIMIT dataset, where M=9 (I think), the dataloader may be OK. The issue appears with large datasets such as VoxCeleb1 or VoxCeleb2, where M>50.
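For larger datasets, one possible way to guarantee that every utterance is visited once per epoch is to shuffle each speaker's utterances and cut them into chunks of M. This is a minimal sketch under assumptions, not the repository's implementation: `epoch_chunks` is a hypothetical helper, it assumes every speaker has at least M utterances, and it tops up a short final chunk with utterances already used in the same epoch.

```python
import random

def epoch_chunks(utts_per_speaker, m, seed=0):
    """Return (speaker, utterance_indices) chunks covering every utterance.

    Assumes each speaker has at least m utterances (hypothetical constraint).
    A short final chunk is topped up with indices used earlier in the epoch.
    """
    rng = random.Random(seed)
    chunks = []
    for spk, count in utts_per_speaker.items():
        order = list(range(count))
        rng.shuffle(order)
        for start in range(0, count, m):
            chunk = order[start:start + m]
            if len(chunk) < m:
                # Reuse earlier indices so every chunk has exactly m items.
                chunk += rng.sample(order[:start], m - len(chunk))
            chunks.append((spk, chunk))
    rng.shuffle(chunks)  # mix speakers across the epoch
    return chunks

# Hypothetical corpus: utterance counts per speaker.
corpus = {"spk_a": 7, "spk_b": 12}
chunks = epoch_chunks(corpus, m=5)
covered = {(spk, u) for spk, idxs in chunks for u in idxs}
print(len(chunks), len(covered))
```

Grouping these chunks into batches of N distinct speakers is left out of the sketch. The point is only that the number of steps per epoch then depends on the dataset size, which makes epoch-based logic such as early stopping meaningful.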

dkatsiros commented 3 years ago

@HarryVolek Can you check this, please? If that is the case, I will open a PR.