maum-ai / cotatron

Official code for Cotatron @ INTERSPEECH 2020
https://mindslab-ai.github.io/cotatron
BSD 3-Clause "New" or "Revised" License

What changes are needed to make this compatible with HiFi-GAN? #15

Open faranaziz opened 3 years ago

faranaziz commented 3 years ago

I mean the preprocessing of the mels. See https://github.com/jik876/hifi-gan/issues/61

seungwonpark commented 3 years ago

If you're using HiFi-GAN (as shown in Assem-VC), then you should have a look at both the mel calculation code and the configuration:

You'll need to change the first one to https://github.com/jik876/hifi-gan/blob/master/meldataset.py#L49-L72, and the second one to https://github.com/jik876/hifi-gan/blob/master/config_v3.json#L17-L27. Hope that helps.
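
A quick way to see which settings still differ after such a change is to compare the two mel configurations key by key. The sketch below is only an illustration: `mel_config_mismatches` is a hypothetical helper, the key names follow the HiFi-GAN JSON config as I understand it, and all values are placeholders rather than values copied from either repository. In practice both dictionaries would be loaded from the actual config files, with the acoustic model's keys mapped to the same names first.

```python
# Hypothetical helper: report which mel settings differ between two configs.
# Key names are assumed to follow the HiFi-GAN JSON config; adjust as needed.
MEL_KEYS = ("sampling_rate", "n_fft", "hop_size", "win_size",
            "num_mels", "fmin", "fmax")


def mel_config_mismatches(acoustic_cfg, vocoder_cfg, keys=MEL_KEYS):
    """Return {key: (acoustic_value, vocoder_value)} for every differing key."""
    mismatches = {}
    for key in keys:
        a = acoustic_cfg.get(key)  # None if the key is missing
        v = vocoder_cfg.get(key)
        if a != v:
            mismatches[key] = (a, v)
    return mismatches


# Placeholder values for illustration only; read the real ones from the
# config files of the vocoder and the acoustic model.
vocoder_cfg = {"sampling_rate": 22050, "n_fft": 1024, "hop_size": 256,
               "win_size": 1024, "num_mels": 80, "fmin": 0, "fmax": 8000}
acoustic_cfg = {"sampling_rate": 22050, "n_fft": 1024, "hop_size": 256,
                "win_size": 1024, "num_mels": 80, "fmin": 70.0, "fmax": 8000.0}

print(mel_config_mismatches(acoustic_cfg, vocoder_cfg))
```

An empty result only means the listed parameters agree; the mel calculation code itself (padding, windowing, log compression) still has to match as well.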

faranaziz commented 3 years ago

Thanks so much. Is this how you trained the new upcoming model?

seungwonpark commented 3 years ago

No. We changed the mel calculation script and configuration to match HiFi-GAN's version.

faranaziz commented 3 years ago

I made the suggested change and now get artifacts like the ones reported here: https://colab.research.google.com/gist/tulasiram58827/8a7660ab21ca3ea246141bbe4b0f87c0/hifigan_issue.ipynb

Rashi2011 commented 3 years ago

I am trying to take the mel from FastSpeech2 and feed it into HiFi-GAN to generate audio, but all I am getting is noise. Please suggest some ideas.
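
One thing that may be worth checking in a case like this (an assumption, not a diagnosis) is whether the mel from the acoustic model uses the same log compression the vocoder was trained on. As I read the HiFi-GAN mel code linked above, it clamps the linear mel to a small floor and takes the natural log; a model that emits log10 mels would then be off by a constant factor. The sketch below uses plain Python lists in place of tensors, and the function names and the `1e-5` floor are assumptions to verify against the actual code.

```python
import math

CLIP_VAL = 1e-5  # assumed floor applied before the log; verify against the vocoder code


def compress_natural_log(mel_linear, clip_val=CLIP_VAL):
    """Clamp linear mel magnitudes to a floor, then take the natural log."""
    return [math.log(max(x, clip_val)) for x in mel_linear]


def log10_to_natural_log(mel_log10):
    """Convert log10-compressed mel values to natural-log values."""
    # ln(x) = log10(x) * ln(10)
    return [x * math.log(10.0) for x in mel_log10]


# Sanity check: both routes should agree for values above the floor.
linear = [0.5, 0.01, 2.0]
from_log10 = log10_to_natural_log([math.log10(x) for x in linear])
direct = compress_natural_log(linear)
print(all(abs(a - b) < 1e-9 for a, b in zip(from_log10, direct)))
```

If the acoustic model additionally normalizes its mels (for example by mean and variance), that step would need to be undone before the vocoder sees them.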