Janghyun1230 / Speaker_Verification

Tensorflow implementation of "Generalized End-to-End Loss for Speaker Verification"
MIT License

Lack of batch shuffling when training #29

Closed JisongXie closed 3 years ago

JisongXie commented 3 years ago

It seems that batch shuffling is missing during training. As a result, the loss decreases very fast but the model learns nothing useful: it only learns a fixed similarity-matrix output, and after training the model does not work. Each batch of data should be permuted before the forward pass and unpermuted afterwards. Here is an example from the PyTorch version, Pytorch_Speaker_Verification:

        # flatten (N, M, frames, mels) -> (N*M, frames, mels)
        mel_db_batch = torch.reshape(mel_db_batch, (hp.train.N*hp.train.M, mel_db_batch.size(2), mel_db_batch.size(3)))
        # random permutation of the N*M utterances, plus its inverse
        perm = random.sample(range(0, hp.train.N*hp.train.M), hp.train.N*hp.train.M)
        unperm = list(perm)
        for i, j in enumerate(perm):
            unperm[j] = i
        # shuffle utterances before the forward pass
        mel_db_batch = mel_db_batch[perm]
        # gradient accumulates
        optimizer.zero_grad()
        embeddings = embedder_net(mel_db_batch)
        # restore the original speaker-grouped order before computing the GE2E loss
        embeddings = embeddings[unperm]
        embeddings = torch.reshape(embeddings, (hp.train.N, hp.train.M, embeddings.size(1)))
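The perm/unperm trick above can be checked in isolation with plain Python lists (a minimal torch-free sketch; `N` and `M` are hypothetical batch dimensions standing in for `hp.train.N` and `hp.train.M`):

```python
import random

N, M = 4, 5  # hypothetical: N speakers, M utterances per speaker
data = list(range(N * M))  # stand-in for the flattened batch

# random permutation of indices, as in the snippet above
perm = random.sample(range(N * M), N * M)

# build the inverse permutation: unperm[perm[i]] = i
unperm = [0] * (N * M)
for i, j in enumerate(perm):
    unperm[j] = i

shuffled = [data[i] for i in perm]        # before the forward pass
restored = [shuffled[i] for i in unperm]  # after the forward pass

assert restored == data  # unperm exactly undoes perm
```

Because `unperm` is the exact inverse of `perm`, the embeddings line up with their original speaker grouping again when reshaped to `(N, M, dim)`, so the GE2E loss sees the correct similarity matrix while the network itself only ever sees shuffled inputs.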