keon / seq2seq

Minimal Seq2Seq model with Attention for Neural Machine Translation in PyTorch
MIT License

Bug in the loss function in 'train' and 'evaluate' #4

Closed: ruizheliUOA closed this issue 6 years ago

ruizheliUOA commented 6 years ago

Hi, I found that the loss function in 'train' and 'evaluate' is cross_entropy. But in model.py, the output of the decoder is already passed through a log_softmax. By definition, cross_entropy includes the log_softmax operation, so log_softmax ends up being applied twice. I think the loss function in 'train' and 'evaluate' should be nll_loss instead; then the decoder's final log_softmax combined with nll_loss is exactly cross-entropy.

pskrunner14 commented 6 years ago

Yeah @ruizheliUOA, I noticed the same thing.

keon commented 6 years ago

Thanks for pointing it out. You are right. I haven't been able to change it due to my laziness :) I will change it someday....

pskrunner14 commented 6 years ago

@keon happy to help :)