meriamOu closed this issue 9 months ago
Oh sorry.
I have noticed a difference.
When I trained the model, I left out the loss mask for the w2v loss,
so after some tests I added a loss mask in the trainer, as below:
# This is the paper version (the one behind my tensorboard log)
loss_w2v = F.l1_loss(w2v_x, w2v_predicted) * hps.train.c_mel
# This is the released version
loss_w2v = (torch.sum(torch.abs(w2v_x - (w2v_predicted*mask)))/(torch.sum(mask)*1024)) * hps.train.c_mel
Sorry for the confusion.
To be honest, both models sound good to me, so it is hard to recommend one over the other...
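To see why the two versions can report different numbers for the same predictions, here is a minimal sketch on a toy padded batch. It assumes features shaped `[batch, 1024, frames]` and a mask shaped `[batch, 1, frames]`; these shapes, the zero padding, and the helper names `l1_unmasked` and `l1_masked` are illustrative assumptions, not taken from the repository. The `hps.train.c_mel` scaling is left out because it multiplies both versions equally.

```python
import torch
import torch.nn.functional as F

def l1_unmasked(target, pred):
    # Paper version as quoted above: mean over every element, padded frames included.
    return F.l1_loss(target, pred)

def l1_masked(target, pred, mask):
    # Released version as quoted above: the mask is applied to the prediction only,
    # and the sum is normalized by (valid frames * channel dimension).
    return torch.sum(torch.abs(target - pred * mask)) / (torch.sum(mask) * target.size(1))

# Toy batch (assumed shapes): 2 utterances, 1024 channels, 4 frames, lengths 4 and 2.
B, C, T = 2, 1024, 4
lengths = torch.tensor([4, 2])
mask = (torch.arange(T)[None, :] < lengths[:, None]).unsqueeze(1).float()  # [B, 1, T]

# Assumption: padded frames are zero in both target and prediction,
# and the prediction is off by exactly 1.0 on every valid frame.
target = torch.ones(B, C, T) * mask
pred = (target + 1.0) * mask

unmasked = l1_unmasked(target, pred)   # error is averaged over 8 frames, 2 of them padding
masked = l1_masked(target, pred, mask) # error is averaged over the 6 valid frames only

print(f"unmasked: {unmasked.item():.2f}, masked: {masked.item():.2f}")
```

In this toy case the unmasked mean is diluted by the padded frames, so it comes out lower than the masked value even though the per-frame error is identical. Whether that accounts for the whole gap between two real training curves depends on how much padding the batches contain, so treat it as one possible contributor rather than a diagnosis.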
Hey, thank you so much for your immediate reply. The problem is not just the loss numbers; it is the slurring that degrades the speech. I will swap the loss for the one used in the tensorboard log. Thank you so much.
Hey, thank you for sharing the training curve. I am training the model on LibriTTS 960 using exactly your preprocessing and training files. However, my training curve doesn't show the same losses you got: while you reached a w2v loss of 2.74 at 100k (https://github.com/sh-lee-prml/HierSpeechpp/issues/4#issuecomment-1827099857), my training converges at a loss of 6. The model's output is slurred as well.
What could be the reason for that? Thank you.