redmist328 / APNet2

Source code of APNet2, a vocoder
MIT License

Phase loss weights #1

Open Kristopher-Chen opened 10 months ago

Kristopher-Chen commented 10 months ago

Hi, I wonder whether a phase loss weight of 100 is too large. In my experiment, the unweighted phase loss is about 3, while the mel/amplitude loss is only about 0.2. Would it be reasonable to reduce the phase loss weight to 10 or 5? Or could you share your loss curves? Thanks a lot!
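To make the scale comparison concrete, here is a minimal sketch of how the weight changes each term's share of the summed loss. The raw values 3 and 0.2 are the numbers quoted above; the helper name `weighted_share` and the mel/amplitude weight of 1.0 are assumptions for illustration only, not values taken from the APNet2 config.

```python
def weighted_share(raw_losses, weights):
    """Return each term's weighted value and its fraction of the weighted total."""
    weighted = {k: raw_losses[k] * weights[k] for k in raw_losses}
    total = sum(weighted.values())
    return {k: (v, v / total) for k, v in weighted.items()}

# Raw (unweighted) loss values quoted in the question.
raw = {"phase": 3.0, "mel_amp": 0.2}

for phase_weight in (100.0, 10.0, 5.0):
    # mel_amp weight of 1.0 is a placeholder assumption.
    shares = weighted_share(raw, {"phase": phase_weight, "mel_amp": 1.0})
    value, frac = shares["phase"]
    print(f"phase weight {phase_weight:>5}: weighted phase = {value:.1f}, share = {frac:.3f}")
```

Under these assumed weights the phase term dominates the sum even at a weight of 5, so the relative balance depends as much on the weights of the other terms as on the phase weight itself.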

redmist328 commented 10 months ago

This is my loss curve (see the attached image). We see the same thing during training. However, in our experiments we found that the sound quality deteriorates if the weight on the phase loss is too small. Our guess is that phase needs a higher weight than amplitude. You can also see from the image that Instantaneous_Phase_Loss has barely dropped, which is a problem that still needs to be solved.
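For readers judging whether an instantaneous phase loss has "dropped", a reference point can help. The sketch below assumes the loss is the mean anti-wrapped absolute phase error, the form used for APNet-style phase losses; the function names, array shapes, and random data are hypothetical and not taken from the repository. For uniformly random phases, the expected value of this quantity is π/2 (about 1.57).

```python
import numpy as np

def anti_wrapping(x):
    """Map a phase difference to the magnitude of its principal value, in [0, pi]."""
    return np.abs(x - 2.0 * np.pi * np.round(x / (2.0 * np.pi)))

def instantaneous_phase_loss(phase_pred, phase_true):
    """Mean anti-wrapped absolute error between two phase spectra (radians)."""
    return float(np.mean(anti_wrapping(phase_pred - phase_true)))

rng = np.random.default_rng(0)
# Hypothetical shape: (frequency bins, frames).
phase_true = rng.uniform(-np.pi, np.pi, size=(513, 100))
phase_rand = rng.uniform(-np.pi, np.pi, size=(513, 100))

same = instantaneous_phase_loss(phase_true, phase_true)              # identical phases
shifted = instantaneous_phase_loss(phase_true + 2 * np.pi, phase_true)  # off by one full turn
baseline = instantaneous_phase_loss(phase_rand, phase_true)          # uninformed prediction

print(same, shifted, baseline)
```

A logged value close to the random baseline would suggest the predicted phase carries little information, while a value well below it indicates the model has learned something even if the curve looks flat.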

softrimewu commented 8 months ago

Have you checked your MRD loss? I find it a bit confusing to use a weight of 0.1 for the generator MRD loss while using a weight of 1.0 for the discriminator-side MRD loss. In my observation, the MRD's prediction becomes very close to 0 on generated samples and 1 on real samples, which seems consistent with these loss weights but is still unexpected.
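One way to track this behaviour is to log the discriminator's mean output on real and generated batches. A minimal sketch follows; the helper name `discriminator_gap`, the 0.9 threshold, and the sample scores are all hypothetical, and it assumes a discriminator trained toward 1 on real audio and 0 on generated audio, as described above.

```python
import numpy as np

def discriminator_gap(real_scores, fake_scores):
    """Return mean score on real, mean score on generated, and their difference."""
    real_mean = float(np.mean(real_scores))
    fake_mean = float(np.mean(fake_scores))
    return real_mean, fake_mean, real_mean - fake_mean

# Hypothetical scores resembling the behaviour described above.
real_scores = np.array([0.98, 0.99, 1.01, 0.97])
fake_scores = np.array([0.02, 0.01, 0.03, 0.00])

real_mean, fake_mean, gap = discriminator_gap(real_scores, fake_scores)

# A gap near 1 means the two sets are separated almost perfectly, which can be
# a sign that the discriminator is overpowering the generator.
if gap > 0.9:  # illustrative threshold, not a tuned value
    print(f"gap={gap:.3f}: discriminator may be dominating")
```

Plotting this gap over training steps, per sub-discriminator if there are several, makes it easier to see when the separation starts.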

redmist328 commented 8 months ago

Hello softrimewu, you're absolutely right. For the MRD coefficient, we followed the Vocos configuration. The discriminator overfitting problem is mentioned in this article. Using such a discriminator directly can improve waveform quality to a certain extent, but if you want to address the overfitting further, you can refer to the approach described in PhaseAug. Due to time constraints, we haven't had a chance to try it ourselves.