Closed · as595 closed this issue 4 years ago
Hi, it's been a little while since I looked at this code, but I'll take a look. Thanks for bringing it up!
Thanks @as595 for pointing this out. I'll fix it later today, or you're more than welcome to submit a PR.
It appears the softmax in the output layer is unnecessary:
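A minimal sketch of the change (the class and parameter names here are placeholders, not the repo's actual model):

```python
import torch.nn as nn

class Classifier(nn.Module):  # hypothetical model, for illustration only
    def __init__(self, in_features, num_classes):
        super().__init__()
        self.fc = nn.Linear(in_features, num_classes)

    def forward(self, x):
        # Before: return F.softmax(self.fc(x), dim=1)
        # After: return the raw logits; F.cross_entropy applies
        # log_softmax internally, so no softmax is needed here.
        return self.fc(x)
```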
In the train() function, the softmax is effectively being applied twice: once in the model's output layer, and again inside F.cross_entropy(), which computes log_softmax on its input.
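With the softmax removed from the model, the training step can pass raw logits straight to the loss. A sketch of what that might look like (the signature is a guess, not the repo's actual train()):

```python
import torch.nn.functional as F

def train(model, device, loader, optimizer):  # hypothetical signature
    model.train()
    for data, target in loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        logits = model(data)                     # raw logits, no softmax
        loss = F.cross_entropy(logits, target)   # log_softmax + NLL in one call
        loss.backward()
        optimizer.step()
```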
Code for the test() function will need to change slightly: predictions are currently based on the logits passed through softmax, so once the softmax is removed the code must be amended to ensure test_loss is still calculated correctly. The predictions themselves are unaffected, since argmax over raw logits gives the same result as argmax over softmax output.
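A sketch of the amended evaluation loop (again, the signature is assumed, not taken from the repo):

```python
import torch
import torch.nn.functional as F

def test(model, device, loader):  # hypothetical signature
    model.eval()
    test_loss, correct = 0.0, 0
    with torch.no_grad():
        for data, target in loader:
            data, target = data.to(device), target.to(device)
            logits = model(data)
            # Sum per-batch losses so dividing by the dataset size
            # gives the correct average loss.
            test_loss += F.cross_entropy(logits, target, reduction='sum').item()
            # argmax over raw logits matches argmax over softmax output.
            correct += logits.argmax(dim=1).eq(target).sum().item()
    test_loss /= len(loader.dataset)
    return test_loss, correct / len(loader.dataset)
```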
The model passes the logits through softmax in the output layer, but then the F.cross_entropy() function is used, which already combines log_softmax and NLL loss. I think the softmax in the output layer may not be necessary?
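For reference, the equivalence can be checked directly with a toy example (dummy data, not from the repo):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)            # dummy batch: 4 samples, 10 classes
target = torch.randint(0, 10, (4,))

a = F.cross_entropy(logits, target)
b = F.nll_loss(F.log_softmax(logits, dim=1), target)
assert torch.allclose(a, b)            # cross_entropy == log_softmax + NLL
```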