dkatsiros opened this issue 4 years ago (status: open)
For the TIMIT dataset, where M=9 (I think), the dataloader may be OK. The issue appears with large datasets such as VoxCeleb1 or VoxCeleb2, where M>50.
@HarryVolek Can you check this, please? If that is the case, I will open a PR.
As a result of the lines referenced below, the model is trained on N*M utterances per epoch and not on the whole training set. This affects convergence as well as possible extensions of the code (e.g. early stopping). Here N is the number of speakers per batch and M is the number of utterances per speaker per batch, as in the referenced paper.
https://github.com/HarryVolek/PyTorch_Speaker_Verification/blob/10e159a8d3255503c0184cde4eb7097968857a31/data_load.py#L39-L40
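
To make the concern concrete, here is a minimal sketch of how much of the training set one epoch touches. It assumes the dataset yields one item per speaker and draws M utterances at random for that speaker. This is an illustration of the behaviour described above, not the repository's code. `utterances_seen_per_epoch`, `num_speakers` and `utts_per_speaker` are hypothetical names, and the sizes are made up rather than taken from TIMIT or VoxCeleb.

```python
import random


def utterances_seen_per_epoch(num_speakers, utts_per_speaker, m, seed=0):
    """Simulate one epoch: one item per speaker, m random utterances each.

    Returns (utterances drawn, total utterances in the training set).
    Assumption: utterances are drawn without replacement within a speaker.
    """
    rng = random.Random(seed)
    drawn = 0
    for _speaker in range(num_speakers):
        # Pick m utterance indices for this speaker, as the sketch assumes.
        picked = rng.sample(range(utts_per_speaker), m)
        drawn += len(picked)
    total = num_speakers * utts_per_speaker
    return drawn, total


# Hypothetical small dataset: few utterances per speaker.
small_drawn, small_total = utterances_seen_per_epoch(100, 9, m=6)
# Hypothetical large dataset: many utterances per speaker.
large_drawn, large_total = utterances_seen_per_epoch(100, 100, m=6)

print("small: fraction seen per epoch =", small_drawn / small_total)
print("large: fraction seen per epoch =", large_drawn / large_total)
```

Under these assumptions the number of utterances drawn per epoch does not depend on how many utterances each speaker has, so the fraction of the training set covered shrinks as the dataset grows. That is why a per-epoch criterion such as early stopping would be evaluated on only a small slice of a large dataset.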