Wrong HiFiGAN's MPD? - Githubissues

TensorSpeech / TensorFlowTTS

:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)

Apache License 2.0

3.76k stars 803 forks

For stable HiFiGAN training, the feature-matching loss (feature_loss) of the MPD has to be computed, so each period discriminator (PD) should return the outputs of all of its convolution layers, not only the last one. Currently each PD returns only the final output, and the intermediate outputs are not saved: https://github.com/TensorSpeech/TensorFlowTTS/blob/136877136355c82d7ba474ceb7a8f133bd84767e/tensorflow_tts/models/hifigan.py#L309-L329
Maybe something like this:

    out = []
    for i, layer in enumerate(self.convs):
        x = layer(x)
        x = self.activation(x)
        out.append(tf.reshape(x, [shape[0], -1, min(self.filters * (self.filter_scales ** (i + 1)), self.max_filters)]))
    x = self.conv_post(x)
    x = tf.reshape(x, [shape[0], -1, self.out_filters])
    out.append(x)
    return out
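To show why the intermediate outputs matter, here is a rough sketch of how a feature-matching loss would consume the list each PD returns. It uses NumPy instead of TensorFlow so it runs on its own. The name `feature_matching_loss` and the convention that the last list element is the final score map are assumptions for illustration, not the repo's actual training code.

```python
import numpy as np


def feature_matching_loss(real_outputs, fake_outputs):
    """Average L1 distance between intermediate feature maps.

    real_outputs / fake_outputs: one list per sub-discriminator (PD).
    Each inner list is assumed to hold the intermediate feature maps,
    followed by the final score map as its last element.
    """
    total = 0.0
    count = 0
    for feats_real, feats_fake in zip(real_outputs, fake_outputs):
        # Skip the last element: that is the final score map, which is
        # assumed to be handled by the adversarial loss instead.
        for f_real, f_fake in zip(feats_real[:-1], feats_fake[:-1]):
            total += float(np.mean(np.abs(f_real - f_fake)))
            count += 1
    if count == 0:
        # This is the situation when a PD returns only its last output.
        raise ValueError("no intermediate feature maps to match")
    return total / count


# One PD, one intermediate feature map plus the final score map.
real = [[np.ones((2, 4, 3)), np.zeros((2, 4, 1))]]
fake = [[np.zeros((2, 4, 3)), np.ones((2, 4, 1))]]
loss = feature_matching_loss(real, fake)
```

Under these assumptions, a PD that returns only `[x]` leaves nothing to match, so the feature loss for the MPD would be empty.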

Wrong HiFiGAN's MPD? #773