NVIDIA / semantic-segmentation

Nvidia Semantic Segmentation monorepo

Some confusion about the Relaxed Loss #50

Open chenboheng opened 4 years ago

chenboheng commented 4 years ago

Hi, I am very interested in the relaxed loss, but the loss code confuses me. At line 188 of loss.py, if the batch size is 4, then border_weights = weights has shape (4, H, W) and holds the number of labels in each 3*3 region for all 4 images. At line 155, the boundary loss of image i is normalized by -1 / border_weights. Why should the boundary loss of image i be divided by the label counts of the other images (the other 3 images)? I think the boundary loss for image i should be -log sum(P(C)), and this division cannot give that result. Could you give some explanation?
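
For concreteness, this is what I understand the relaxed loss at a single boundary pixel to be (a small sketch; the variable names are just for illustration):

# Illustrative only: relaxed loss at one boundary pixel, -log sum_{c in allowed} P(c).
import torch

probs = torch.tensor([0.5, 0.3, 0.2])    # softmax output over 3 classes
allowed = torch.tensor([1.0, 1.0, 0.0])  # multi-hot mask: classes 0 and 1 are valid here
loss_at_pixel = -torch.log((probs * allowed).sum())
print(loss_at_pixel.item())              # -log(0.8) ~= 0.223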

bryanyzhu commented 4 years ago

@karansapra Could you comment on this? Thank you.

bryanyzhu commented 4 years ago

Hi, in my understanding it is not divided by the batch size; it is normalized by the number of classes contained in the boundary pixels.

As you can see at line 172 in loss.py:

weights = target[:, :-1, :, :].sum(1).float()

The second dimension is the class dimension, which has 19 classes for Cityscapes. If we take the summation over it, boundary pixels will contain multiple classes, e.g., 2 or more. Hence, when computing the final loss, we want to normalize over these multiple classes.
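
For example (a small illustrative sketch with 2 classes plus the ignore channel, not the repo code):

# Multi-hot relaxed target for 2 classes + 1 ignore channel, and the per-pixel
# class count that line 172 computes.
import torch

B, C, H, W = 1, 2 + 1, 3, 3
target = torch.zeros(B, C, H, W)
target[0, 0, :, :2] = 1   # class 0 is present in the left two columns
target[0, 1, :, 1:] = 1   # class 1 is present in the right two columns
# the middle column is a boundary: both classes are allowed there

weights = target[:, :-1, :, :].sum(1).float()  # per-pixel number of valid classes
print(weights[0])
# tensor([[1., 2., 1.],
#         [1., 2., 1.],
#         [1., 2., 1.]])
# Dividing the per-pixel loss by these counts normalizes boundary pixels that
# carry more than one class.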

Maybe my explanation is not clear enough, but keep in mind that we transform the ground-truth label when using the relaxed loss. You can look at this function for more details: https://github.com/NVIDIA/semantic-segmentation/blob/master/transforms/transforms.py#L74
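
Conceptually, the relaxation replaces the hard label at each pixel with the union of the labels found in a small window around it. A rough sketch of that idea (not the exact code in transforms.py):

# Rough conceptual sketch of label relaxation: a class is marked "present" at a
# pixel if it appears anywhere in a 3x3 window around that pixel.
import torch
import torch.nn.functional as F

def relax_one_hot(label, num_classes, window=3):
    """label: (H, W) integer map -> (num_classes, H, W) multi-hot map."""
    one_hot = F.one_hot(label, num_classes).permute(2, 0, 1).float()
    pad = window // 2
    # Max-pooling a one-hot map takes the union of classes inside the window.
    relaxed = F.max_pool2d(one_hot.unsqueeze(0), window, stride=1, padding=pad)
    return relaxed.squeeze(0)

label = torch.tensor([[0, 0, 1],
                      [0, 0, 1],
                      [0, 0, 1]])
print(relax_one_hot(label, num_classes=2))
# Pixels near the class boundary end up with both classes set to 1.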

@karansapra Please correct me if I'm wrong, feel free to add more explanations.

karansapra commented 4 years ago

@bryanyzhu is correct. It is a normalization. We modify the target here: https://github.com/NVIDIA/semantic-segmentation/blob/master/transforms/transforms.py#L74

Let me know if this doesn't clarify things.

Sorry for the delay.

wy3406 commented 3 years ago

@bryanyzhu Hi, I used gluon-cv to rewrite the ImgWtLossSoftNLL function. Why are the border_weights of the whole batch used for normalization instead of a single image's border_weights[i]? In my training, the loss increases as the batch size becomes larger.

import numpy as np
from mxnet import gluon
from mxnet.gluon.loss import SoftmaxCrossEntropyLoss


class ImgWtLossSoftNLL(gluon.HybridBlock):
    """
    Relaxed (boundary-aware) loss, a gluon-cv port of ImgWtLossSoftNLL from loss.py.
    """
    def __init__(self, classes=19, weight=None, size_average=True, ignore_label=-1,
                 nll_loss=SoftmaxCrossEntropyLoss,
                 norm=False, upper_bound=1.0, **kwargs):
        super(ImgWtLossSoftNLL, self).__init__()
        self.num_classes = classes
        self.batch_weights = False
        self.ignore_label = ignore_label
        self.norm = norm
        self.upper_bound = upper_bound

    def customsoftmax(self, F, inp, multihotmask):
        """
        Custom softmax: at boundary pixels the probability mass of all
        allowed classes is summed before taking the log.
        """
        soft = F.softmax(inp, axis=1)  # softmax over the class dimension (MXNet default is axis=-1)
        # mask * softmax sums up the probabilities of the classes present at the
        # border, then the element-wise max of summed vs. unsummed is taken
        return F.log(
            F.broadcast_maximum(soft, multihotmask * (soft * multihotmask).sum(1, keepdims=True))
        )

    def calculate_weights(self, target):
        """
        Calculate class weights based on the class frequencies in the training crop.
        """
        if len(target.shape) == 3:
            hist = np.sum(target, axis=(1, 2)) * 1.0 / target.sum()
        else:
            hist = np.sum(target, axis=(0, 2, 3)) * 1.0 / target.sum()
        if self.norm:
            hist = ((hist != 0) * self.upper_bound * (1 / hist)) + 1
        else:
            hist = ((hist != 0) * self.upper_bound * (1 - hist)) + 1
        return hist[:-1]  # drop the ignore channel

    def custom_nll(self, F, inputs, target, class_weights, border_weights, mask):
        """
        Relaxed NLL loss for a single image.
        """
        # if self.REDUCE_BORDER_EPOCH != -1 and \
        #    self.EPOCH > self.REDUCE_BORDER_EPOCH:
        #     border_weights = 1 / border_weights
        #     target[target > 1] = 1

        wts = class_weights.expand_dims(0).expand_dims(2).expand_dims(3)

        smax = self.customsoftmax(F, inputs, target[:, :-1, :, :])
        loss_matrix = (-1 / border_weights *
                       (target[:, :-1, :, :] *
                        wts * smax).sum(1)) * (1. - mask)

        loss = loss_matrix.sum()

        # +1 to prevent division by 0
        loss = loss / (target.shape[0] * target.shape[2] * target.shape[3] -
                       mask.sum() + 1)
        return loss

    def hybrid_forward(self, F, inputs, target):
        # target: [B, classes + 1, H, W] (multi-hot relaxed label plus ignore channel)
        weights = target[:, :-1, :, :].sum(1)
        ignore_mask = (weights == 0)
        weights = F.where(ignore_mask, F.ones_like(weights), weights)

        target_cpu = target.asnumpy()

        if self.batch_weights:
            class_weights = self.calculate_weights(target_cpu)
        losses = list()
        for i in range(0, inputs.shape[0]):
            if not self.batch_weights:
                class_weights = self.calculate_weights(target_cpu[i])
            nll_loss = self.custom_nll(F,
                inputs[i].expand_dims(0),
                target[i].expand_dims(0),
                class_weights=F.array(class_weights, dtype=inputs.dtype, ctx=inputs.context),
                border_weights=weights, mask=ignore_mask[i])
            losses.append(nll_loss)  # Here border_weights=weights should be written as border_weights=weights[i]

        return F.concat(*losses, dim=0).sum()
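
For reference, here is a small shape experiment (illustrative only, not the repo code) showing why this matters: the whole-batch weights of shape (B, H, W) broadcast against the single image's (1, H, W) loss map, so the summed loss grows with the batch size.

# Broadcasting (B, H, W) weights against a (1, H, W) per-image loss map
# replicates the loss B times, so it scales with the batch size.
import torch

B, H, W = 4, 2, 2
weights = torch.full((B, H, W), 2.0)   # per-pixel class counts for the whole batch
per_image_loss = torch.ones(1, H, W)   # summed-over-classes loss map of image i

whole_batch = (per_image_loss / weights).sum()        # broadcasts to (B, H, W)
per_image   = (per_image_loss / weights[0:1]).sum()   # stays (1, H, W)
print(whole_batch.item(), per_image.item())           # 8.0 vs 2.0
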
ggjy commented 3 years ago

I have the same question as @wy3406. Also, should the returned loss be divided by the batch size, i.e., loss / inputs.shape[0]? Otherwise the loss increases as the batch size becomes larger.

Seyoung9304 commented 1 year ago

I have the same question as @wy3406!