The generated output contains a lot of noise.

kan-bayashi / ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

MIT License

1.54k stars, 339 forks

I am currently training Multi-band MelGAN, and my Text2Mel model is a FastSpeech2 trained with ESPnet. During synthesis, the outputs contain a lot of noise, even though I am using checkpoint-50000 and the hyperparameters look fine on inspection. Everything appears normal in the predictions directory, but I cannot get the expected output during synthesis. I am attaching the Text2Mel and Mel2Wav config files and would appreciate it if you could take a look and let me know if you can spot the issue.

Attachments: configs.zip, log with 3 generated samples.zip
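One thing that can be checked mechanically is whether the Text2Mel and Mel2Wav configs describe the same mel-spectrogram features. Whether that is the cause here is not known. The sketch below is only a hypothetical helper: `find_mismatches`, the `KEY_PAIRS` table, and all values in the usage example are assumptions for illustration, and the key names should be adjusted to whatever the two config files actually contain.

```python
# Hypothetical helper (not part of ESPnet or ParallelWaveGAN): compare the
# feature-extraction settings used by a Text2Mel model and a Mel2Wav vocoder.
# The key names are assumptions; edit KEY_PAIRS to match your own configs.
KEY_PAIRS = [
    # (key in the Text2Mel settings, key in the Mel2Wav settings)
    ("fs", "sampling_rate"),
    ("n_fft", "fft_size"),
    ("hop_length", "hop_size"),
    ("win_length", "win_length"),
    ("n_mels", "num_mels"),
    ("fmin", "fmin"),
    ("fmax", "fmax"),
]


def find_mismatches(text2mel, mel2wav, key_pairs=KEY_PAIRS):
    """Return (t2m_key, t2m_value, m2w_key, m2w_value) for every pair that differs.

    Both arguments are plain dicts. A key missing from a dict is read as None,
    so a key missing on only one side is reported as a mismatch.
    """
    mismatches = []
    for t2m_key, m2w_key in key_pairs:
        t2m_value = text2mel.get(t2m_key)
        m2w_value = mel2wav.get(m2w_key)
        if t2m_value != m2w_value:
            mismatches.append((t2m_key, t2m_value, m2w_key, m2w_value))
    return mismatches


if __name__ == "__main__":
    # Made-up values, only to show the call; not taken from the attached configs.
    text2mel = {"fs": 24000, "n_fft": 2048, "hop_length": 300, "n_mels": 80}
    mel2wav = {"sampling_rate": 24000, "fft_size": 2048, "hop_size": 256, "num_mels": 80}
    for t2m_key, t2m_value, m2w_key, m2w_value in find_mismatches(text2mel, mel2wav):
        print(f"{t2m_key}={t2m_value!r} vs {m2w_key}={m2w_value!r}")
```

The helper takes plain dicts so it does not depend on how the configs are loaded. If the settings are nested in the real files, the relevant sub-dictionary would need to be passed in.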


The generated output contains a lot of noise. #410