Open nationalflag opened 6 years ago
Can you try to run python focalloss_test.py?
But when I replace CrossEntropyLoss with FocalLoss to train my network, the corresponding training loss is lower.
I use the same value for the alpha of FocalLoss as for the weight of CrossEntropyLoss ([0.366, 1]); the max_error is 0.12252533435821533.
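For context, a comparison along these lines (a sketch assuming this repo's focalloss.FocalLoss; the actual focalloss_test.py may differ in detail):

import torch
import torch.nn as nn
from focalloss import FocalLoss  # this repo's implementation (assumed module name)

torch.manual_seed(0)
logits = torch.randn(16, 2)
target = torch.randint(0, 2, (16,))

ce = nn.CrossEntropyLoss(weight=torch.tensor([0.366, 1.0]))(logits, target)
fl = FocalLoss(gamma=0, alpha=[0.366, 1.0])(logits, target)
print((ce - fl).abs().item())  # the max_error quoted above comes from checks like this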
I rewrote the FocalLoss code for my experiments and I ran into the same problem as you.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BCFocalLoss(nn.Module):
    def __init__(self, gamma=0, alpha=None, size_average=True):
        super(BCFocalLoss, self).__init__()
        self.gamma = gamma
        self.alpha = alpha
        if isinstance(alpha, (float, int)): self.alpha = torch.Tensor([alpha, 1 - alpha])
        if isinstance(alpha, list): self.alpha = torch.Tensor(alpha)
        self.size_average = size_average

    def forward(self, input, target):
        # input: raw logits; target: float tensor of 0/1 labels with the same shape
        if input.dim() > 2:
            input = input.view(input.size(0), input.size(1), -1)  # N,C,H,W => N,C,H*W
        input = input.contiguous().view(-1)
        target = target.view(-1)
        # log pt for both classes: log(sigmoid(x)) where target==1,
        # log(1 - sigmoid(x)) where target==0
        logpt = -F.binary_cross_entropy_with_logits(input, target, reduction='none')
        pt = logpt.detach().exp()  # modulating factor treated as a constant w.r.t. the graph
        if self.alpha is not None:
            if self.alpha.type() != input.type():
                self.alpha = self.alpha.type_as(input)
            at = self.alpha.gather(0, target.long().view(-1))  # per-sample class weight
            logpt = logpt * at
        loss = -1 * (1 - pt) ** self.gamma * logpt
        if self.size_average: return loss.mean()
        else: return loss.sum()
When gamma=0 it behaves like BCEWithLogitsLoss in my test code. But when I replace BCEWithLogitsLoss with BCFocalLoss to train my network, the corresponding training loss is much lower. Here is the test code:
torch.random.manual_seed(32)
p = torch.randn(1, 1, 56, 56, dtype=torch.float32, requires_grad=True)
t = torch.ones(1, 1, 56, 56, dtype=torch.float32)
criterion = BCFocalLoss()
result = criterion(p, t)
result2 = nn.BCEWithLogitsLoss()(p, t)
print(result.item(), result2.item())  # equal when gamma=0 and alpha is None
result.backward()
print(p.grad.sum())
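With the per-class logpt above, the gamma=0 agreement with BCEWithLogitsLoss should also hold for mixed targets, not just the all-ones t used here. A quick sanity check (the random threshold is arbitrary):

p2 = torch.randn(1, 1, 56, 56, requires_grad=True)
t2 = (torch.rand(1, 1, 56, 56) > 0.5).float()  # mixed 0/1 targets
print(BCFocalLoss()(p2, t2).item(), nn.BCEWithLogitsLoss()(p2, t2).item())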
Same problem. Have you solved it? @nationalflag @clcarwin @huaifeng1993
Same problem here.
If you find that the loss using CE-loss is much lower than that using Focal-loss, you can try the following:

logpt = -F.cross_entropy(input, target.view(-1), reduction="none")

The reduction="none" part is important.
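For illustration, here is a minimal multi-class sketch built around that line (the class name FocalLossCE and its defaults are mine, not from the repo). With reduction="none", F.cross_entropy keeps one loss value per sample, so the (1 - pt)**gamma factor can weight each sample before averaging; with the default reduction, the batch would already be collapsed to a single mean and the focal term would have nothing per-sample to modulate:

import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLossCE(nn.Module):  # illustrative name, not the repo's class
    def __init__(self, gamma=2.0, weight=None):
        super().__init__()
        self.gamma = gamma
        self.weight = weight  # optional per-class weights, analogous to alpha

    def forward(self, input, target):
        # input: (N, C) logits; target: (N,) class indices
        logpt = -F.cross_entropy(input, target.view(-1),
                                 weight=self.weight, reduction="none")
        pt = logpt.detach().exp()
        # per-sample focal modulation, then the mean over the batch
        return (-(1 - pt) ** self.gamma * logpt).mean()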
In my experiments, the loss of FocalLoss with gamma=0 is much lower than the loss of CrossEntropyLoss. What causes this?
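One plausible cause, if alpha/weight is set (worth verifying against your setup): with a weight argument and the default reduction="mean", nn.CrossEntropyLoss divides by the sum of the per-sample weights, whereas an alpha-weighted focal loss that ends in .mean() divides by the batch size N. When the average weight is below 1, as with [0.366, 1], the plain mean comes out systematically smaller even at gamma=0. A minimal sketch of just that normalization difference:

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(8, 2)
target = torch.randint(0, 2, (8,))
w = torch.tensor([0.366, 1.0])

# CrossEntropyLoss with weight: sum(w_i * l_i) / sum(w_i)
ce = nn.CrossEntropyLoss(weight=w)(logits, target)

# alpha-weighted focal loss at gamma=0: sum(w_i * l_i) / N
per_sample = F.cross_entropy(logits, target, reduction="none")
focal_like = (w[target] * per_sample).mean()

print(ce.item(), focal_like.item())  # focal_like <= ce; smaller whenever class 0 appears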