sh-lee-prml / HierSpeechpp

The official implementation of HierSpeech++
MIT License

Can't replicate the training curve for LibriTTS 960 #36

Closed: meriamOu closed this issue 9 months ago

meriamOu commented 10 months ago

Hey, thank you for sharing the training curve. I am training the model on LibriTTS 960 using exactly your preprocessing and training files. However, my training curve doesn't show the same losses you got. While you reached a w2v loss of 2.74 at 100k (https://github.com/sh-lee-prml/HierSpeechpp/issues/4#issuecomment-1827099857), my training converges at a loss of about 6. The model also produces slurred speech.

[Screenshot 2024-02-04 at 2.02.02 PM]

What could be the reason for that? Thank you.

sh-lee-prml commented 10 months ago

Oh sorry.

I have noticed a difference.

When I originally trained the model, I left out the loss mask for the w2v loss.

After some testing, I added a loss mask in the trainer, as shown below:

            # Paper version (matches my TensorBoard log): plain L1 loss averaged over all elements, no mask
            loss_w2v = F.l1_loss(w2v_x, w2v_predicted) * hps.train.c_mel
            # Released version: mask applied to the prediction, sum normalized by torch.sum(mask) * 1024
            loss_w2v = (torch.sum(torch.abs(w2v_x - (w2v_predicted * mask))) / (torch.sum(mask) * 1024)) * hps.train.c_mel
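
To make the difference between the two normalizations concrete, here is a minimal, dependency-free sketch of the arithmetic. It is not the repository's code: plain Python lists stand in for the tensors, the toy values and the names `unmasked_l1`, `masked_l1`, and `feature_dim` are made up for illustration, and the `hps.train.c_mel` scaling is left out because both versions apply it equally. It also assumes the target is zero on padded frames.

```python
# Toy data: one utterance, 4 frames x 2 feature dims.
# The last two frames are padding, so their mask entries are 0.
target = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0]]
predicted = [[1.5, 2.5], [2.0, 5.0], [0.1, 0.1], [0.0, 0.1]]
mask = [1.0, 1.0, 0.0, 0.0]
feature_dim = 2  # stands in for the 1024 in the trainer snippet


def unmasked_l1(target, predicted):
    """Mean absolute error over every element, padding included
    (the default 'mean' reduction of F.l1_loss)."""
    diffs = [abs(t - p)
             for t_row, p_row in zip(target, predicted)
             for t, p in zip(t_row, p_row)]
    return sum(diffs) / len(diffs)


def masked_l1(target, predicted, mask, feature_dim):
    """Sum of |target - predicted * mask|, divided by
    (number of valid frames * feature_dim)."""
    total = sum(abs(t - p * m)
                for t_row, p_row, m in zip(target, predicted, mask)
                for t, p in zip(t_row, p_row))
    return total / (sum(mask) * feature_dim)


unmasked = unmasked_l1(target, predicted)
masked = masked_l1(target, predicted, mask, feature_dim)
print(f"unmasked (paper-style):  {unmasked:.4f}")
print(f"masked (released-style): {masked:.4f}")
```

With this toy data the masked value comes out larger, because the unmasked mean is diluted by padded frames that are easy to predict. That is one plausible reason the two curves sit at different levels, so the raw loss numbers from the two versions are not directly comparable.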

Sorry for the confusion.

To be honest, both models sound good to me, so it is hard to recommend one over the other...

meriamOu commented 10 months ago

Hey, thank you so much for your immediate reply. The problem is not just the loss numbers; it is the slurring that degrades the speech. I will swap the loss for the one from the TensorBoard log. Thank you so much.