Looking through the source code for torch.nn.CrossEntropyLoss(), it seems to apply log_softmax internally. My model also applies log_softmax at the end of its forward(). Isn't this a duplicate? Is it needed?
From model.py:
def forward(self, x):
    ...
    # flatten for input to the fully-connected layer
    x = x.view(x.size(0), -1)
    x = self.fc(x)
    return F.log_softmax(x, dim=1)
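For context, forward()'s output goes straight into nn.CrossEntropyLoss during training. The actual loop isn't in model.py, so this is only a sketch of how I'm using it (variable names are mine):

criterion = torch.nn.CrossEntropyLoss()

output = model(data)              # already log_softmax-ed inside forward()
loss = criterion(output, target)  # and cross_entropy applies log_softmax again
loss.backward()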
From PyTorch source code:
class CrossEntropyLoss(_WeightedLoss):
    ...
    def __init__(self, weight=None, size_average=None, ignore_index=-100,
                 reduce=None, reduction='elementwise_mean'):
        super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    def forward(self, input, target):
        return F.cross_entropy(input, target, weight=self.weight,
                               ignore_index=self.ignore_index, reduction=self.reduction)


def cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100,
                  reduce=None, reduction='elementwise_mean'):
    ...
    if size_average is not None or reduce is not None:
        reduction = _Reduction.legacy_get_string(size_average, reduce)
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
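A quick sanity check (my own snippet, not from the repo) suggests the loss value comes out the same either way, apparently because log_softmax is idempotent (taking log_softmax of a log_softmax output returns the same tensor):

import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)          # fake scores: batch of 4, 10 classes
target = torch.randint(0, 10, (4,))  # fake ground-truth class indices

criterion = torch.nn.CrossEntropyLoss()

# loss on raw logits vs. loss on a log_softmax output (what my forward() returns)
print(criterion(logits, target).item())
print(criterion(F.log_softmax(logits, dim=1), target).item())  # same value for me

# applying log_softmax twice appears to change nothing
once = F.log_softmax(logits, dim=1)
print(torch.allclose(F.log_softmax(once, dim=1), once))  # True

If that's right, the duplication doesn't change the result, but the extra log_softmax in forward() still looks like wasted work.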