Closed EdwinB12 closed 1 year ago
Small question:
In Lab 2.0, Exercise 2.1 instructs us to use a softmax activation in the final layer and therefore to omit `from_logits=True` from the loss function. However, the solutions to Exercise 2.3 switch to using no softmax activation but including `from_logits=True` in the loss function.
Is this done deliberately to show two different ways of doing roughly the same thing?
On a side note, I have found some posts suggesting that `from_logits=True` is more numerically stable, but I suspect this isn't really a consideration for this course.
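For what it's worth, here is a minimal NumPy sketch (not the course's code, just an illustration) of why the logits route is considered more stable: computing softmax first and then taking the log can overflow for large logits, while computing the cross-entropy directly from logits via the log-sum-exp trick stays finite.

```python
import numpy as np

def ce_via_softmax(logits, label):
    # "softmax in the model" route: exponentiate, normalize, then log.
    # np.exp overflows for large logits, yielding inf/inf = nan.
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[label])

def ce_from_logits(logits, label):
    # "from_logits=True" route: log-softmax via the log-sum-exp trick,
    # subtracting the max logit so np.exp never overflows.
    m = logits.max()
    return -(logits[label] - m - np.log(np.exp(logits - m).sum()))

small = np.array([2.0, 1.0, 0.1])
big = np.array([1000.0, 10.0, 1.0])

# For moderate logits both routes agree.
print(ce_via_softmax(small, 0), ce_from_logits(small, 0))

# For large logits the softmax-first route breaks down (nan),
# while the logits route stays finite.
print(ce_via_softmax(big, 0), ce_from_logits(big, 0))
```

Keras losses with `from_logits=True` use a fused log-softmax like the second function, which is why the two setups are mathematically equivalent but not numerically identical.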
Resolved in 2afeaa37726b3736cae32d462657f85e6cef8941
Good spot. We do start moving to using logits from Lab 3 onwards, but Exercise 2.3 should not be using them yet.