automl-classroom / iML-ws21-ex08


Wrong loss function for get_gradients? #1

Open dwoiwode opened 2 years ago

dwoiwode commented 2 years ago

In the slides (w08, slide 14) it says that we should use the negative log likelihood loss, but the tests for get_gradient only seem to work with the cross_entropy loss. Did we mix something up in the slides?

maxidl commented 2 years ago

Hi, good question.

In the classification case, the cross entropy loss (as implemented in pytorch) is equivalent to the combination of a LogSoftmax and an NLLLoss (negative log likelihood loss). It is usually preferred to have models return unnormalized log-probabilities (logits, i.e. the values before the softmax), because combining the two operations into one can improve numerical stability.
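A minimal sketch checking this equivalence in pytorch (the shapes and values here are made up for illustration):

```python
import torch
import torch.nn.functional as F

# Hypothetical logits for a batch of 4 samples and 3 classes.
logits = torch.randn(4, 3, requires_grad=True)
targets = torch.tensor([0, 2, 1, 2])

# cross_entropy applied directly to the raw logits ...
ce = F.cross_entropy(logits, targets)

# ... equals NLLLoss applied to the log-softmax of the logits.
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(ce, nll))  # True
```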

In summary: the slides and the tests agree. Passing logits to cross_entropy is the same as applying a LogSoftmax followed by an NLLLoss, so cross_entropy *is* the negative log likelihood loss in this setting.

Hope this was helpful.

dwoiwode commented 2 years ago

Ah, interesting. Thank you!