as-ideas / TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
https://as-ideas.github.io/TransformerTTS/

Remove the last normalization layer in Postnet. #86

Closed taylorlu closed 3 years ago

taylorlu commented 3 years ago

The last normalization layer should be removed from the Postnet: mel_linear and final_output must stay nearly identical, so the Postnet's residual correction has to shrink toward zero, but a final normalization layer rescales that correction to unit variance. Training both at the same time becomes a contradiction and makes the residual module lose its effect.

Here are the loss curves when training with and without the norm layer:

[Two screenshots: loss curves with and without the final norm layer]

Refer to the codes by NVIDIA's: https://github.com/NVIDIA/tacotron2/blob/185cd24e046cc1304b4f8e564734d2498c6e2e6f/model.py#L141-L144 https://github.com/NVIDIA/tacotron2/blob/185cd24e046cc1304b4f8e564734d2498c6e2e6f/model.py#L510-L511
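A minimal NumPy sketch of the conflict described above. The variable names (`mel_linear`, `residual`) and shapes are illustrative, not taken from this repo's code; `batch_norm` mimics only the train-time normalization step of a BatchNorm layer.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize over the batch axis, as a BatchNorm layer does at train time.
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

rng = np.random.default_rng(0)
mel_linear = rng.normal(size=(8, 80))        # hypothetical decoder mel output
residual = 1e-3 * rng.normal(size=(8, 80))   # near-zero Postnet correction

# Without a final norm layer the correction can stay tiny, so
# final_output ~= mel_linear, which is what the residual path wants:
out_no_norm = mel_linear + residual

# With a final norm layer the tiny correction is rescaled to unit
# variance, so it can never shrink toward zero -- the contradiction:
out_with_norm = mel_linear + batch_norm(residual)

print(np.abs(out_no_norm - mel_linear).max())    # small
print(np.abs(out_with_norm - mel_linear).max())  # order 1
```

However small the Postnet's raw output becomes, the normalization re-inflates it, which is why the linked NVIDIA Tacotron 2 code leaves the last convolution without a norm layer.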

cfrancesco commented 3 years ago

Hi, thank you for the very valuable insight and experiment. I will not merge the PR because we are switching to the main branch, where the convolutions in the Postnet were removed entirely, and I will leave this branch as it is to stay compatible with the current pre-trained models.