rentainhe / what_I_have_read

Just for self-motivation

Understanding LabelSmooth and SoftTargetCrossEntropy #21

Open · rentainhe opened this issue 2 years ago

rentainhe commented 2 years ago

Reference article

CrossEntropy = Log + Softmax + NLL_Loss

import torch
import torch.nn.functional as F
torch.manual_seed(1)

def softmax(tensor, dim=1):
    # exponentiate, then normalize each row by its sum
    tensor = torch.exp(tensor)
    total = torch.sum(tensor, dim).reshape(-1, 1)
    tensor = tensor / total
    return tensor

def log_softmax(tensor, dim=1):
    return torch.log(softmax(tensor, dim))

def nll_loss(tensor, label):
    # pick the log-probability at the target index via a one-hot mask,
    # then average over the batch and negate
    one_hot = torch.nn.functional.one_hot(label, 10)
    loss = -torch.sum(one_hot * tensor) / tensor.shape[0]
    return loss

def cross_entropy(tensor, label):
    tensor = log_softmax(tensor, 1)
    tensor = nll_loss(tensor, label)
    return tensor

# simulate MNIST-style logits and labels
out = torch.randn([2, 10])
label = torch.randint(0, 10, [2])

# PyTorch implementation
print(F.cross_entropy(out, label).item())  # 3.927539587020874
# our implementation
print(cross_entropy(out, label).item())    # 3.927539825439453
rentainhe commented 2 years ago

NLL Loss

# nll_loss picks the value at the target index for each sample,
# applies the reduction (sum or mean), and negates the result
predict = torch.tensor([[2., 3., 1.],
                        [3., 7., 9.]])
label = torch.tensor([1, 2])  # sample 1: value at index 1 is 3; sample 2: value at index 2 is 9
loss = F.nll_loss(predict, label)
print(loss)  # tensor(-6.)

>>> torch.softmax(predict, dim=-1)
tensor([[0.2447, 0.6652, 0.0900],
        [0.0022, 0.1189, 0.8789]])

>>> torch.log(torch.softmax(predict, dim=-1))
tensor([[-1.4076, -0.4076, -2.4076],
        [-6.1291, -2.1291, -0.1291]])

>>> predict = torch.log(torch.softmax(predict, dim=-1))
>>> F.nll_loss(predict, label)
tensor(0.2684)
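
Putting the two snippets above together: applying F.cross_entropy directly to the raw scores gives the same number, since it is exactly log_softmax followed by nll_loss. A small check, reusing the predict/label values from above:

# cross_entropy(logits) == nll_loss(log_softmax(logits))
import torch
import torch.nn.functional as F

predict = torch.tensor([[2., 3., 1.],
                        [3., 7., 9.]])
label = torch.tensor([1, 2])

print(F.cross_entropy(predict, label))                    # tensor(0.2684)
print(F.nll_loss(F.log_softmax(predict, dim=-1), label))  # tensor(0.2684)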
rentainhe commented 2 years ago

Label Smoothing derivation

https://blog.csdn.net/qq_27182145/article/details/108509227

Handling of the one-hot encoding

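The embedded image is not preserved here; as a minimal sketch of the standard label-smoothing rule (the smoothing factor epsilon and class count are assumed values, not taken from the image), the one-hot target keeps 1 - epsilon on the true class and spreads epsilon uniformly over all classes:

# sketch: label smoothing applied to one-hot targets (epsilon, num_classes assumed)
import torch
import torch.nn.functional as F

epsilon = 0.1
num_classes = 3
label = torch.tensor([1, 2])

one_hot = F.one_hot(label, num_classes).float()
# keep 1 - epsilon on the true class, distribute epsilon / num_classes everywhere
smoothed = one_hot * (1 - epsilon) + epsilon / num_classes
print(smoothed)
# tensor([[0.0333, 0.9333, 0.0333],
#         [0.0333, 0.0333, 0.9333]])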

Cross-entropy loss computation with label smoothing

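The image with the loss derivation is likewise missing. As a hedged sketch, the two losses named in the title are commonly implemented along the lines of timm's LabelSmoothingCrossEntropy and SoftTargetCrossEntropy; the class names, default smoothing value, and exact details below follow that style from memory rather than this issue:

# sketch in the spirit of timm-style implementations (names/defaults assumed)
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelSmoothingCrossEntropy(nn.Module):
    """Cross entropy with a hard integer target, smoothed by epsilon."""
    def __init__(self, smoothing=0.1):
        super().__init__()
        self.smoothing = smoothing
        self.confidence = 1.0 - smoothing

    def forward(self, x, target):
        logprobs = F.log_softmax(x, dim=-1)
        # -log p at the true class (the usual NLL term)
        nll = -logprobs.gather(dim=-1, index=target.unsqueeze(1)).squeeze(1)
        # -log p averaged over all classes (the uniform smoothing term)
        smooth = -logprobs.mean(dim=-1)
        loss = self.confidence * nll + self.smoothing * smooth
        return loss.mean()

class SoftTargetCrossEntropy(nn.Module):
    """Cross entropy against an arbitrary soft target (e.g. mixup/cutmix labels)."""
    def forward(self, x, target):
        loss = torch.sum(-target * F.log_softmax(x, dim=-1), dim=-1)
        return loss.mean()

# usage check: smoothing the hard label into a soft target gives the same loss
logits = torch.randn(2, 3)
hard_label = torch.tensor([1, 2])
soft_label = F.one_hot(hard_label, 3).float() * 0.9 + 0.1 / 3

print(LabelSmoothingCrossEntropy(0.1)(logits, hard_label))
print(SoftTargetCrossEntropy()(logits, soft_label))  # same value: smoothing folded into the target

The two calls agree because multiplying out the soft target (1 - epsilon) * one_hot + epsilon / K inside the cross-entropy sum reproduces exactly the confidence * nll + smoothing * smooth split used in LabelSmoothingCrossEntropy.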