rishikksh20 / HiFi-GAN

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
MIT License

Mel loss computation in the generator #8

Closed: akashrajkn closed this issue 3 years ago

akashrajkn commented 3 years ago

In the generator's training step, the mel loss is computed from mel_fake:

mel_fake = stft.mel_spectrogram(fake_audio.squeeze(1))

mel_fake will not have a gradient function. If you look at the stft.mel_spectrogram function, you will see the line magnitudes = magnitudes.data, which detaches the tensor from the autograd graph (I guess this was done for better performance?). As a result, loss_mel has no gradient function either, so it does not contribute to training. If you agree, I can submit a PR to fix this.
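
A minimal sketch of the effect, assuming PyTorch and a toy STFT magnitude in place of the repository's stft.mel_spectrogram. The helper name stft_magnitudes, the tensor sizes, and the STFT parameters are illustrative and not taken from the repo:

```python
import torch

# Hypothetical stand-in for the generator output; requires_grad=True mimics
# a tensor that depends on trainable parameters.
fake_audio = torch.randn(1, 1024, requires_grad=True)
window = torch.hann_window(256)

def stft_magnitudes(audio, detach):
    """Toy STFT magnitude (illustrative, not the repo's mel_spectrogram)."""
    spec = torch.stft(audio, n_fft=256, hop_length=64, window=window,
                      return_complex=True)
    magnitudes = spec.abs()
    # `.data` returns a tensor that is cut off from the autograd graph.
    return magnitudes.data if detach else magnitudes

detached = stft_magnitudes(fake_audio, detach=True)
attached = stft_magnitudes(fake_audio, detach=False)

print(detached.requires_grad, detached.grad_fn)   # no gradient history
print(attached.requires_grad, attached.grad_fn is not None)

# Only the attached version can backpropagate to the audio.
attached.mean().backward()
print(fake_audio.grad.shape)
```

Calling backward() on a loss built only from the detached tensor raises a RuntimeError. When such a loss is summed with other losses that do carry gradients, no error is raised and the detached term is silently ignored, which makes this kind of bug easy to miss.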

Cheers, and really nice that you implemented the paper :)

rishikksh20 commented 3 years ago

Sure