HLTCHKUST / PAML

Personalizing Dialogue Agents via Meta-Learning
MIT License

Applying F.log_softmax on Generator output #11

Closed robinsongh381 closed 4 years ago

robinsongh381 commented 4 years ago

Hi

You have applied F.log_softmax on the output of the projection layer in [line 232](https://github.com/HLTCHKUST/PAML/blob/3c1fe4e55956b74fe0682b431726e5396f8db490/model/transformer.py#L232).

If we use nn.CrossEntropyLoss as the loss function, the result of F.log_softmax is fed into the loss as in [line 333](https://github.com/HLTCHKUST/PAML/blob/3c1fe4e55956b74fe0682b431726e5396f8db490/model/transformer.py#L333).

So the output of the projection layer would go through F.log_softmax and then through nn.CrossEntropyLoss.

However, per the PyTorch documentation, nn.CrossEntropyLoss already applies F.log_softmax internally, so log_softmax would run twice. I think you should exclude line 232 and instead just return [line 218](https://github.com/HLTCHKUST/PAML/blob/3c1fe4e55956b74fe0682b431726e5396f8db490/model/transformer.py#L218) directly.
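For illustration, here is a minimal sketch of the concern (shapes are made up): if nn.CrossEntropyLoss were the criterion, feeding it log-probabilities would apply log_softmax twice and change the loss value:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Made-up shapes for illustration: batch of 4, vocabulary of 10.
logits = torch.randn(4, 10)
target = torch.randint(0, 10, (4,))

# If nn.CrossEntropyLoss were the criterion, log_softmax would run twice:
# once explicitly here, once inside cross_entropy.
double = F.cross_entropy(F.log_softmax(logits, dim=-1), target)

# What nn.CrossEntropyLoss actually expects: raw logits.
single = F.cross_entropy(logits, target)

print(double.item(), single.item())  # the two losses differ
```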

What do you think?

zlinao commented 4 years ago

Indeed, we use `self.criterion = nn.NLLLoss(ignore_index=config.PAD_idx)` instead of nn.CrossEntropyLoss.

zlinao commented 4 years ago

You can also use raw logits + nn.CrossEntropyLoss.
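For reference, a minimal sketch of the two equivalent pairings (here `PAD_idx` stands in for the repo's `config.PAD_idx`): log_softmax + nn.NLLLoss, as in the repo, and raw logits + nn.CrossEntropyLoss give the same loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.randn(4, 10)            # made-up batch of 4, vocab of 10
target = torch.randint(0, 10, (4,))
PAD_idx = 0                            # stand-in for config.PAD_idx

# What the repo does: log-probabilities into NLLLoss.
nll = nn.NLLLoss(ignore_index=PAD_idx)
loss_a = nll(F.log_softmax(logits, dim=-1), target)

# The equivalent alternative: raw logits into CrossEntropyLoss,
# which applies log_softmax internally.
ce = nn.CrossEntropyLoss(ignore_index=PAD_idx)
loss_b = ce(logits, target)

assert torch.allclose(loss_a, loss_b)  # same loss either way
```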