r9y9 / tacotron_pytorch

PyTorch implementation of Tacotron speech synthesis model.
http://nbviewer.jupyter.org/github/r9y9/tacotron_pytorch/blob/master/notebooks/Test%20Tacotron.ipynb

How are linear targets being passed to the model? #7

Closed by 7404N 6 years ago

7404N commented 6 years ago

This might be a really dumb question, but I am not sure I understand why you are not passing the linear targets to the model here: https://github.com/r9y9/tacotron_pytorch/blob/master/train.py#L240. Are you using a pre-trained PostNet in your implementation?

r9y9 commented 6 years ago

According to the Tacotron paper, the PostNet transforms the mel-spectrogram into a linear-frequency log-amplitude spectrogram, so the model itself does not need the linear targets as input. I think the code looks okay. I'm not using a pre-trained PostNet.

The linear targets are used to compute the L1 loss against the predicted linear outputs. See https://github.com/r9y9/tacotron_pytorch/blob/5f41d9d8f70a299a49f02aa5422478e1693ebe93/train.py#L246-L248.
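To illustrate the point, here is a minimal sketch of a training step in which the linear targets never enter the forward pass and appear only in the loss. `ToyTacotron`, its layer sizes, and the tensor shapes are hypothetical stand-ins made up for this example, not the repository's actual model or `train.py` code, and the real loss may weight its terms differently.

```python
import torch
from torch import nn


class ToyTacotron(nn.Module):
    """Hypothetical stand-in for the real model (illustration only)."""

    def __init__(self, n_symbols=10, embed_dim=8, mel_dim=4, linear_dim=6):
        super().__init__()
        self.embedding = nn.Embedding(n_symbols, embed_dim)
        self.decoder = nn.Linear(embed_dim + mel_dim, mel_dim)
        # PostNet: maps mel frames to linear-frequency frames.
        self.postnet = nn.Linear(mel_dim, linear_dim)

    def forward(self, text, mel_targets):
        # Teacher forcing: the decoder sees the previous ground-truth mel frame.
        go_frame = torch.zeros_like(mel_targets[:, :1])
        prev_mel = torch.cat([go_frame, mel_targets[:, :-1]], dim=1)
        # Crude "attention": average the text embeddings into one context vector.
        context = self.embedding(text).mean(dim=1, keepdim=True)
        context = context.expand(-1, mel_targets.size(1), -1)
        mel_outputs = self.decoder(torch.cat([context, prev_mel], dim=-1))
        linear_outputs = self.postnet(mel_outputs)
        return mel_outputs, linear_outputs


torch.manual_seed(0)
model = ToyTacotron()
criterion = nn.L1Loss()

text = torch.randint(0, 10, (2, 5))   # (batch, text_len), made-up sizes
mel = torch.randn(2, 7, 4)            # (batch, frames, mel_dim)
linear = torch.randn(2, 7, 6)         # (batch, frames, linear_dim)

# Only the text and the mel targets are passed to the model.
mel_outputs, linear_outputs = model(text, mel)

# The linear targets show up here, in the loss, and nowhere else.
mel_loss = criterion(mel_outputs, mel)
linear_loss = criterion(linear_outputs, linear)
loss = mel_loss + linear_loss
loss.backward()
```

Because `linear_loss` backpropagates through `postnet`, the PostNet is trained jointly with the rest of the network, which is why no pre-trained PostNet is needed.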

7404N commented 6 years ago

Oh right! Sorry, that makes sense. Thanks!