Text2Mel input to MelGan outputs noisy audio file without any speech

seungwonpark / melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)

http://swpark.me/melgan/

BSD 3-Clause "New" or "Revised" License

633 stars, 116 forks

Text2Mel input to MelGan outputs noisy audio file without any speech #50

Closed: deepconsc closed this issue 3 years ago

deepconsc commented 4 years ago

Hey!

I've retrained the text2mel model after cutting the mel reduction step out of the preprocessor and changing the hparams to:

hop_length = 256
win_length = 1024
max_N = 180  # Maximum number of characters.
max_T = 210  # Maximum number of mel frames.
e = 512  # embedding dimension
d = 256  # Text2Mel hidden unit dimension

I'm trying to feed the generated mels to MelGAN, but the output audio file is just a noisy honk with no speech in it. Any ideas?
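A possible first debugging step for a symptom like this is to compare a generated mel against a mel produced by the vocoder's own preprocessing for the same kind of audio. The sketch below is only an illustration and is not code from either repository: `describe_mel`, `find_mismatches`, the `(n_mels, n_frames)` layout, the 22050 Hz sample rate, and the tolerance value are all assumptions. It checks whether the channel count and the value range agree, since a vocoder trained on one scaling will typically produce noise when fed another.

```python
import numpy as np


def describe_mel(mel, hop_length=256, sample_rate=22050):
    """Summarize a mel spectrogram assumed to be shaped (n_mels, n_frames).

    hop_length matches the hparams quoted above; sample_rate is an assumption.
    """
    mel = np.asarray(mel, dtype=np.float32)
    if mel.ndim != 2:
        raise ValueError(f"expected a 2-D array, got shape {mel.shape}")
    n_mels, n_frames = mel.shape
    return {
        "n_mels": n_mels,
        "n_frames": n_frames,
        "min": float(mel.min()),
        "max": float(mel.max()),
        # Number of audio samples the vocoder should emit for this mel.
        "expected_samples": n_frames * hop_length,
        "expected_seconds": n_frames * hop_length / sample_rate,
    }


def find_mismatches(generated, reference, range_tol=1.0):
    """Return human-readable differences between two mels.

    range_tol is a hypothetical threshold on how far the min/max may differ.
    """
    g = describe_mel(generated)
    r = describe_mel(reference)
    problems = []
    if g["n_mels"] != r["n_mels"]:
        problems.append(
            f"channel count differs: {g['n_mels']} vs {r['n_mels']}"
        )
    if (abs(g["min"] - r["min"]) > range_tol
            or abs(g["max"] - r["max"]) > range_tol):
        problems.append(
            "value range differs: "
            f"[{g['min']:.2f}, {g['max']:.2f}] vs "
            f"[{r['min']:.2f}, {r['max']:.2f}]"
        )
    return problems


if __name__ == "__main__":
    # Synthetic placeholder data, not real model output: one array scaled to
    # [0, 1] and one spread over a wider log-like range.
    rng = np.random.default_rng(0)
    generated = rng.uniform(0.0, 1.0, size=(80, 210))
    reference = rng.uniform(-11.5, 2.0, size=(80, 210))
    for line in find_mismatches(generated, reference):
        print(line)
```

If a check like this reports a range difference, the two pipelines probably normalize their mels differently, and the generated mels would need to be converted to the vocoder's scaling (or the text2mel model retrained on mels from the vocoder's preprocessing) before synthesis.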

c1a1o1 commented 3 years ago

Same problem here!