Got a very low MIoU after simply swapping out the cross entropy loss for "lovasz_softmax"

bermanmaxim / LovaszSoftmax

Code for the Lovász-Softmax loss (CVPR 2018)

http://bmax.im/LovaszSoftmax

MIT License

1.38k stars, 269 forks

Got a very low MIoU after simply swapping out the cross entropy loss for "lovasz_softmax" #20

Closed Tensorfengsheng1926 closed 5 years ago

Tensorfengsheng1926 commented 5 years ago

Hello, nice to read this paper. I ran into a problem: I get a very low mIoU (0.003) from DeepLabv3+ trained with lovasz_softmax. The same model normally reaches mIoU = 76% with cross-entropy loss.

Environment:
PyTorch 1.0
Ubuntu 16.04
Batch size: 10
Dataset: Pascal VOC 2012 (aug)
Loaded ImageNet-pretrained ResNet-101 weights

And here is my wrapper code for the Lovasz-Softmax loss:

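import torch.nn as nn
import torch.nn.functional as F
# Assumption: lovasz_losses.py from this repository is on the import path.
from lovasz_losses import lovasz_softmax

class LovaszSoftmax(nn.Module):
    def __init__(self, per_image=False):
        super(LovaszSoftmax, self).__init__()
        self.lovasz_softmax = lovasz_softmax
        self.per_image = per_image

    def forward(self, pred, label):
        # lovasz_softmax expects class probabilities, so convert the logits first.
        pred = F.softmax(pred, dim=1)
        # Label 255 marks void pixels in Pascal VOC and is ignored.
        return self.lovasz_softmax(pred, label, per_image=self.per_image, ignore=255)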

Thanks!

bermanmaxim commented 5 years ago

There can be some hyperparameters to adjust when optimizing with LL alone (compared to e.g. LL + lovasz_softmax), but 0.003 mIoU clearly points to a problem somewhere.

How big is your batch size?
Can you try with only_present = True?
How does the value of the loss behave during the optimization?

Tensorfengsheng1926 commented 5 years ago

Thanks a lot for your advice! I have set only_present = True and it seems to work fine.

[Image: loss]

Now I would like to understand why it works after setting this parameter. Could you explain it? Thanks again!

bermanmaxim commented 5 years ago

The Lovasz-Softmax approximates the IoU computed over one batch, not over the whole dataset. This can be problematic for small batches, especially for classes that are absent from the current batch. In practice it can lead to your output converging towards a background-only prediction, or to not converging at all. only_present counts the IoU only over the classes present in the batch, which mitigates this.
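
To make the effect of absent classes concrete, here is a minimal pure-Python sketch of the loss on a flattened toy batch. It is an illustrative re-implementation, not the repository code: the function names `lovasz_grad_sketch` and `lovasz_softmax_sketch`, the `classes` argument with values "all" and "present", and the toy data are all assumptions made for this example.

```python
def lovasz_grad_sketch(gt_sorted):
    """Gradient of the Lovasz extension of the Jaccard loss,
    given ground-truth indicators sorted by decreasing error."""
    total = sum(gt_sorted)
    grad, fg_seen, bg_seen, prev = [], 0.0, 0.0, 0.0
    for g in gt_sorted:
        fg_seen += g          # foreground pixels counted as errors so far
        bg_seen += 1.0 - g    # background pixels counted as errors so far
        jaccard = 1.0 - (total - fg_seen) / (total + bg_seen)
        grad.append(jaccard - prev)
        prev = jaccard
    return grad


def lovasz_softmax_sketch(probas, labels, classes="present"):
    """probas: list of per-pixel class-probability lists; labels: list of ints.
    classes="present" skips classes with no pixel in this batch (hypothetical flag)."""
    num_classes = len(probas[0])
    losses = []
    for c in range(num_classes):
        fg = [1.0 if y == c else 0.0 for y in labels]
        if classes == "present" and sum(fg) == 0:
            continue  # class absent from the batch: leave it out of the average
        errors = [abs(f - p[c]) for f, p in zip(fg, probas)]
        order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
        grad = lovasz_grad_sketch([fg[i] for i in order])
        losses.append(sum(errors[i] * g for i, g in zip(order, grad)))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: 4 pixels, 3 classes, class 2 never appears in the labels.
probas = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.3, 0.5, 0.2]]
labels = [0, 0, 1, 1]

loss_all = lovasz_softmax_sketch(probas, labels, classes="all")
loss_present = lovasz_softmax_sketch(probas, labels, classes="present")
print(loss_all, loss_present)
```

In this sketch, a class with no pixels in the batch contributes exactly its largest predicted probability (0.2 for class 2 in the toy data). With many absent classes and a small batch, those terms can make up a large share of the average, which is consistent with the instability described above. Averaging over present classes only removes them.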

JohnMBrandt commented 5 years ago

@bermanmaxim how would you implement this for the binary case, where the only option is to set classes=[1] in lovasz_softmax? I seem to be having this issue of unstable loss because, in some batches, the images contain only the background class.

ZM-J commented 3 years ago

How is only_present used? I only see classes='present'.