abdulrahimq opened this issue 3 years ago
Thanks! The content of the linked post is as follows:
It usually depends on the problem you are trying to solve. For a binary classification problem, you might want a sigmoid activation function on the last layer (or softmax in the case of a multiclass problem), so that you can get an estimate of the probability of your input belonging to the specified class(es). In the case of regression, however, there may be no need for a final activation function, because we want the network to predict a continuous range of values rather than something restricted to a range like (0, 1). It may also depend on the cost function you use, because certain loss functions in PyTorch fold the final non-linearity into their own implementation, so we can avoid defining a final activation function explicitly in our network (e.g. `CrossEntropyLoss` combines `LogSoftmax` and `NLLLoss`).
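To make the last point concrete, here is a minimal sketch (the tensor shapes, seed, and values are arbitrary assumptions for illustration) verifying that `CrossEntropyLoss` applied to raw logits matches `LogSoftmax` followed by `NLLLoss`:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Raw, unnormalized scores (logits) for a batch of 4 samples over 3 classes.
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 2])

# CrossEntropyLoss expects raw logits: it applies LogSoftmax internally.
ce = nn.CrossEntropyLoss()(logits, targets)

# Equivalent: apply LogSoftmax explicitly, then NLLLoss.
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, targets)

print(torch.allclose(ce, nll))  # True: no final activation needed in the model
```

Because the loss handles the non-linearity, the network's `forward` can return raw logits during training; softmax is only applied at inference time if probabilities are needed.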
We can either add this to the blog or remove it altogether. What is your opinion?
Sorry, I hadn't seen the comment. I think the better option is to add the text to the blog. I can do that if you are still interested?
Yes, please, go ahead. Then we have to track this across all languages. See issue #144.
1. Go to https://atcold.github.io/pytorch-Deep-Learning/en/week02/02-3/
2. Search for "The activation of the last layer in general would depend on your use case, as explained in this Piazza post."
3. The link doesn't work.