Open nukes opened 3 years ago
Nope, I haven't tried it yet, but I am planning to.
Hi, any update on this? I tried this idea, but the result is not good. I use a combination of full-band STFT, sub-band STFT, mel, and adversarial losses, and the predicted waveform has an artifact in a specific frequency bin. After 400k steps, this artifact still has not disappeared. I want to know whether you still see the same issue and whether you still use the mel loss as part of the generator loss.
@nukes I trained it for around 1M steps; the artifact band disappeared around 800k, and the quality is also good.
Good news! What I observe is that the artifact appears periodically: for example, it disappears at 300k, then reappears at 310k. Did you observe the same pattern? And do you use the mel loss?
@nukes Yes, after 800k that periodicity decreased, and most of the time there are few or no artifacts. The mel loss throws an error because the generated audio sometimes exceeds a value of 1, which causes a problem when we convert the wav to mels for the loss calculation. It does not happen often, mostly around 20k to 40k steps. So I start training with the mel, adversarial, STFT, and sub-band STFT losses, and when the mel loss error pops up around 20k, I just comment out the mel loss. For the remaining training I use only the STFT and sub-band STFT losses with the adversarial loss.
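An alternative to commenting out the mel loss might be to clamp the generated waveform to [-1, 1] before converting it to mels. Below is a minimal sketch in plain Python, assuming the waveform is a list of floats; `clamp_waveform` is a hypothetical helper, not a function from any repository discussed here, and whether clamping avoids the error in this particular setup is untested. In a PyTorch training loop the equivalent call would be `torch.clamp(y_hat, -1.0, 1.0)`, where `y_hat` is an assumed name for the generator output.

```python
def clamp_waveform(samples, low=-1.0, high=1.0):
    """Return a copy of `samples` with every value limited to [low, high]."""
    return [min(max(s, low), high) for s in samples]


# Hypothetical generator output with two out-of-range samples.
generated = [0.2, 1.3, -1.7, -0.5]
clamped = clamp_waveform(generated)
print(clamped)
```

Note that clamping only hides the overshoot from the mel conversion; it does not stop the generator from producing out-of-range samples in the first place.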
Got it! I am still training my model, and I will let you know the result once it reaches 800k.
Also, do you think it is worth trying the MultiStepLR learning rate scheduler, as in mb-melgan? I see the sub-band loss fluctuate dramatically, while the mb-melgan learning curve is much smoother and the periodic artifact disappears around 300k-400k.
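For reference, a MultiStepLR schedule multiplies the learning rate by a fixed factor each time training passes a milestone. A minimal sketch of that rule in plain Python follows; the base rate, milestones, and `gamma` are made-up values for illustration, not the mb-melgan configuration. In PyTorch the same schedule is provided by `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma)`.

```python
from bisect import bisect_right


def multistep_lr(step, base_lr, milestones, gamma):
    """Learning rate at `step`: base_lr times gamma for each milestone reached."""
    # bisect_right counts how many milestones are <= step.
    return base_lr * gamma ** bisect_right(milestones, step)


# Hypothetical settings, chosen only to show the shape of the schedule.
milestones = [100_000, 200_000, 300_000]
for step in (0, 100_000, 250_000, 400_000):
    print(step, multistep_lr(step, 1e-4, milestones, 0.5))
```

The rate stays constant between milestones and drops in steps, which may help damp late-training loss fluctuations, though that is not confirmed for this model.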
@nukes Yeah, I have the same thought on that.
Hi, I tried the "mb-hifigan" idea, but the result is not good. In the high-frequency bins the structure is quite blurry, while the original hifigan performs much better in the high-frequency bins. Did you see the same result? org-hifigan: mb-hifigan:
Hi, what's the result of mb-hifigan now? Is it better?
Hi, did you try the multi-band hifigan idea?