Hi, good question.
In a classification setting, the cross entropy loss (as implemented in PyTorch) is equivalent to a combination of LogSoftmax and NLLLoss (negative log likelihood loss). It is usually preferred to have models return unnormalized scores (logits, i.e. the values before the softmax), because combining the two steps into one operation can improve numerical stability.
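A quick way to convince yourself of this equivalence is to compute both losses on the same random inputs (a minimal sketch, assuming a recent PyTorch version; the shapes and values are just made up for illustration):

```python
import torch
import torch.nn.functional as F

# Toy batch: 4 samples, 3 classes (random logits and class targets)
logits = torch.randn(4, 3)
target = torch.tensor([0, 2, 1, 2])

# cross_entropy applies log_softmax internally, so it takes raw logits
loss_ce = F.cross_entropy(logits, target)

# nll_loss expects log-probabilities, so we apply log_softmax ourselves
loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)

print(torch.allclose(loss_ce, loss_nll))  # True
```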
In summary:

- `torch.nn.functional.cross_entropy` takes logits as inputs (it performs `log_softmax` internally).
- `torch.nn.functional.nll_loss` is like `cross_entropy`, but takes log-probabilities (log-softmax values) as inputs.

Hope this was helpful.
Ah, interesting. Thank you!
In the slides (w08, slide 14) it says that we should use the negative log likelihood loss, but the tests for `get_gradient` seem to work only with the cross_entropy loss. Did we mix something up in the slides?