aim-uofa / AdelaiDet

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
https://git.io/AdelaiDet

Is there something wrong with the calculation of GIoU loss? #115

Open kzs569 opened 4 years ago

kzs569 commented 4 years ago

I found the following GIoU calculation in adet/layers/iou_loss.py:

        pred_left = pred[:, 0]
        pred_top = pred[:, 1]
        pred_right = pred[:, 2]
        pred_bottom = pred[:, 3]

        target_left = target[:, 0]
        target_top = target[:, 1]
        target_right = target[:, 2]
        target_bottom = target[:, 3]

        target_aera = (target_left + target_right) * \
                      (target_top + target_bottom)
        pred_aera = (pred_left + pred_right) * \
                    (pred_top + pred_bottom)

        w_intersect = torch.min(pred_left, target_left) + \
                      torch.min(pred_right, target_right)
        h_intersect = torch.min(pred_bottom, target_bottom) + \
                      torch.min(pred_top, target_top)

        g_w_intersect = torch.max(pred_left, target_left) + \
                        torch.max(pred_right, target_right)
        g_h_intersect = torch.max(pred_bottom, target_bottom) + \
                        torch.max(pred_top, target_top)
        ac_uion = g_w_intersect * g_h_intersect

        area_intersect = w_intersect * h_intersect
        area_union = target_aera + pred_aera - area_intersect

        ious = (area_intersect + 1.0) / (area_union + 1.0)
        gious = ious - (ac_uion - area_union) / ac_uion

From the way the target area is computed,

target_aera = (target_left + target_right) * (target_top + target_bottom)

it appears that target_left, target_right, target_top, and target_bottom are the distances from the center of the bounding box to its left, right, top, and bottom sides.

But GIoU (https://arxiv.org/pdf/1902.09630.pdf) requires finding the smallest rectangle that covers both the target bounding box and the predicted bounding box. In the paper, the min and max operations are applied to box coordinates, either (x, y, w, h) or (x1, y1, x2, y2). The procedure in the code above is almost the same, but your left, right, top, and bottom values are distances, not coordinates. Is there something wrong with the GIoU loss, or have I misunderstood your code?
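For comparison, here is a minimal sketch of the coordinate-based GIoU the question refers to, written for a single pair of (x1, y1, x2, y2) boxes in plain Python. The function name `giou_xyxy` is made up for illustration and is not part of AdelaiDet; the sketch also leaves out the `+ 1.0` smoothing used in the quoted code.

```python
def giou_xyxy(box_a, box_b):
    """GIoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative helper, not taken from AdelaiDet.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection extents, clamped at zero when the boxes do not overlap.
    w_intersect = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    h_intersect = max(0.0, min(ay2, by2) - max(ay1, by1))
    area_intersect = w_intersect * h_intersect
    area_union = area_a + area_b - area_intersect

    # Smallest rectangle that encloses both boxes.
    w_enclose = max(ax2, bx2) - min(ax1, bx1)
    h_enclose = max(ay2, by2) - min(ay1, by1)
    area_enclose = w_enclose * h_enclose

    iou = area_intersect / area_union
    return iou - (area_enclose - area_union) / area_enclose


if __name__ == "__main__":
    print(giou_xyxy((0, 0, 2, 2), (1, 1, 3, 3)))
```

Here the min and max really are taken over coordinates, which is the form the paper describes.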

xiaohu2015 commented 4 years ago

@kzs569 I think the code is right: (target_left + target_right) = w and (target_top + target_bottom) = h.
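One way to check this claim is to compute GIoU both ways and compare. The sketch below assumes that the predicted and target distances are measured from the same reference point and are all non-negative, so both boxes contain that point and always overlap. The helper names (`giou_ltrb`, `ltrb_to_xyxy`, `giou_xyxy`) and the sample numbers are made up for illustration, and the `+ 1.0` smoothing from the quoted code is omitted.

```python
def giou_ltrb(pred, target):
    """GIoU from (left, top, right, bottom) distances to a shared point.

    Mirrors the quoted computation without the +1.0 smoothing (illustrative).
    """
    pl, pt, pr, pb = pred
    tl, tt, tr, tb = target
    pred_area = (pl + pr) * (pt + pb)
    target_area = (tl + tr) * (tt + tb)
    # min of the distances on each side gives the intersection extents.
    area_intersect = (min(pl, tl) + min(pr, tr)) * (min(pt, tt) + min(pb, tb))
    # max of the distances on each side gives the enclosing-box extents.
    area_enclose = (max(pl, tl) + max(pr, tr)) * (max(pt, tt) + max(pb, tb))
    area_union = pred_area + target_area - area_intersect
    return area_intersect / area_union - (area_enclose - area_union) / area_enclose


def ltrb_to_xyxy(point, ltrb):
    """Convert distances around `point` into (x1, y1, x2, y2) coordinates."""
    px, py = point
    left, top, right, bottom = ltrb
    return (px - left, py - top, px + right, py + bottom)


def giou_xyxy(box_a, box_b):
    """Coordinate-based GIoU for (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    h = max(0.0, min(ay2, by2) - max(ay1, by1))
    area_intersect = w * h
    area_union = area_a + area_b - area_intersect
    area_enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return area_intersect / area_union - (area_enclose - area_union) / area_enclose


if __name__ == "__main__":
    point = (10, 10)          # shared reference point (hypothetical values)
    pred = (3, 2, 4, 5)       # left, top, right, bottom
    target = (2, 4, 6, 1)
    a = giou_ltrb(pred, target)
    b = giou_xyxy(ltrb_to_xyxy(point, pred), ltrb_to_xyxy(point, target))
    print(a, b)
```

Under the shared-point assumption the two results agree, because the reference point's own coordinates cancel out of every width and height. If the distances came from different points, or could be negative, the distance-based formula would no longer match the coordinate-based one.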

tensorboy commented 3 years ago

I have the same confusion, @kzs569.