Hi, good question.
In a classification setting, the cross entropy loss (as implemented in PyTorch) is equivalent to a combination of LogSoftmax and NLLLoss (negative log likelihood loss). It is usually preferred to have models return unnormalized scores (logits, i.e. the values before the softmax), because combining the two steps into one operation can improve numerical stability.
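A quick way to convince yourself of this equivalence is to compute both losses on the same random inputs (a minimal sketch, assuming a recent PyTorch version; the shapes and values are just made up for illustration):

```python
import torch
import torch.nn.functional as F

# Toy batch: 4 samples, 3 classes (random logits and class targets)
logits = torch.randn(4, 3)
target = torch.tensor([0, 2, 1, 2])

# cross_entropy applies log_softmax internally, so it takes raw logits
loss_ce = F.cross_entropy(logits, target)

# nll_loss expects log-probabilities, so we apply log_softmax ourselves
loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)

print(torch.allclose(loss_ce, loss_nll))  # True
```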
In summary:

- `torch.nn.functional.cross_entropy` takes logits as inputs (it performs `log_softmax` internally).
- `torch.nn.functional.nll_loss` is like `cross_entropy`, but takes log-probabilities (log-softmax values) as inputs.

Hope this was helpful.
Ah, interesting. Thank you!
In the slides (w08, slide 14) it says that we should use the negative log likelihood loss, but the tests for `get_gradient` seem to work only with the cross_entropy loss. Did we mix something up in the slides?