Closed francislata closed 3 years ago
Thank you for your interest.
We did not run experiments with other models. Because the MPD takes the raw waveform as input and is not coupled to the generator architecture, applying it to other models should be straightforward.
In our experiments, the feature matching loss affected perceptual quality. However, the amount of quality degradation varied by dataset. Because the feature matching loss makes the backward pass computationally heavy, it is worth experimenting without it.
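To illustrate why the MPD is independent of the generator, here is a minimal sketch of the period reshaping step that each sub-discriminator applies to a raw waveform. The function name `reshape_for_period` and the reflection padding are assumptions made for this example, not a copy of the repository code; plain Python lists stand in for tensors.

```python
def reshape_for_period(waveform, period):
    """Fold a 1D waveform into rows of `period` samples each.

    Assumption: the tail is padded by reflection so that the length
    becomes a multiple of `period`. Other padding schemes would also work.
    """
    samples = list(waveform)
    remainder = len(samples) % period
    if remainder:
        pad = period - remainder
        if pad > len(samples) - 1:
            raise ValueError("waveform too short to reflect-pad for this period")
        # Reflect padding: mirror the samples just before the last one.
        samples += samples[-2:-2 - pad:-1]
    # Each row holds `period` consecutive samples, so samples in the same
    # column are exactly `period` steps apart in the original waveform.
    return [samples[i:i + period] for i in range(0, len(samples), period)]


# Hypothetical usage: any waveform works, whichever generator produced it.
rows = reshape_for_period([1, 2, 3, 4, 5], period=2)
columns = [list(col) for col in zip(*rows)]
```

Since the only input is the waveform itself, the same reshaping can be applied to the output of any vocoder before passing it to a 2D convolutional discriminator.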
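For anyone who wants to try the ablation, a minimal sketch of a generator loss with the feature matching term behind a switch could look like the following. The function names, the `use_feature_matching` flag, and the default weights are assumptions for illustration, and flat Python lists stand in for the discriminator feature maps.

```python
def feature_matching_loss(real_features, fake_features):
    """Sum over layers of the mean absolute difference between feature maps.

    Each argument is a list of layers; each layer is a flat list of floats
    (a stand-in for a flattened discriminator feature map).
    """
    total = 0.0
    for real_layer, fake_layer in zip(real_features, fake_features):
        diffs = [abs(r - f) for r, f in zip(real_layer, fake_layer)]
        total += sum(diffs) / len(diffs)
    return total


def generator_loss(adv_loss, mel_loss, real_features, fake_features,
                   use_feature_matching=True, lambda_fm=2.0, lambda_mel=45.0):
    """Combine the loss terms; the weights here are illustrative defaults."""
    loss = adv_loss + lambda_mel * mel_loss
    if use_feature_matching:
        # Skipping this term also avoids backpropagating through every
        # intermediate discriminator layer, which is where the extra cost
        # of the backward pass comes from.
        loss += lambda_fm * feature_matching_loss(real_features, fake_features)
    return loss


# Hypothetical feature maps from two discriminator layers.
real = [[1.0, 2.0], [0.0]]
fake = [[0.0, 2.0], [1.0]]
with_fm = generator_loss(1.0, 0.1, real, fake, use_feature_matching=True)
without_fm = generator_loss(1.0, 0.1, real, fake, use_feature_matching=False)
```

Training once with the flag on and once with it off on the same dataset would show how much the term matters for that data.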
@jik876 This is more of a high-level question.
1) Table 4 of section 4.2 shows the application of MPD on MelGAN. I understand that HiFi-GAN and MelGAN use MSD with almost the same settings. Have you done experiments applying MPD to other GAN-based vocoders, and did they show improvements in perceptual quality?
2) The feature matching loss is used in the MSD by HiFi-GAN as well as MelGAN. Did you run any experiments with and without this loss, and how does it affect the perceptual quality?
Thank you!