Tomiinek / Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
MIT License

No softmax layer in the classifier? #73

Closed · jerryuhoo closed this 2 years ago

jerryuhoo commented 2 years ago

https://github.com/Tomiinek/Multilingual_Text_to_Speech/blob/5cddc8b0531e3102cf4fb3acb0d04d2713b5b3da/modules/classifier.py#L47-L55

Hi, I saw that there is a softmax layer after the hidden layer in your paper, but the code doesn't have one. Does the softmax layer matter?

Tomiinek commented 2 years ago

Hey, it does not matter. The loss is computed with cross-entropy from logits, which is a numerically stable implementation of log-softmax followed by negative log-likelihood (see https://pytorch.org/docs/stable/generated/torch.nn.functional.cross_entropy.html#torch.nn.functional.cross_entropy): https://github.com/Tomiinek/Multilingual_Text_to_Speech/blob/5cddc8b0531e3102cf4fb3acb0d04d2713b5b3da/modules/classifier.py#L69

And we never do inference with the classifier.
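For reference, here is a minimal sketch of the equivalence being described, using standard PyTorch. The tensor shapes are illustrative only and are not taken from the repository:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: a batch of 4 items classified into 3 classes.
logits = torch.randn(4, 3)           # raw classifier outputs, no softmax applied
targets = torch.tensor([0, 2, 1, 0]) # ground-truth class indices

# cross_entropy consumes raw logits: internally it applies log_softmax
# followed by negative log-likelihood, in a numerically stable way.
loss_from_logits = F.cross_entropy(logits, targets)

# Equivalent two-step computation, written out explicitly.
loss_two_step = F.nll_loss(F.log_softmax(logits, dim=1), targets)

assert torch.allclose(loss_from_logits, loss_two_step)
```

This is also why adding an explicit softmax before `F.cross_entropy` would be a bug: the function expects unnormalized logits, so the softmax would effectively be applied twice.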