WongKinYiu / PyTorch_YOLOv4

PyTorch implementation of YOLOv4

Question about box encoding to calculate the iou loss #226

Closed. LongxingTan closed this issue 3 years ago

LongxingTan commented 3 years ago

Thanks for sharing this great code.

I have a question about how the bounding boxes are encoded for the IoU loss. The y_true label (x, y, w, h) is calculated here: gxy becomes the offset from the top-left corner of the corresponding grid cell, so its range is (0, 1). gwh is left unchanged, so its range is (0, gain), where gain is typically 20, 40, or 80.

tbox.append(torch.cat((gxy - gij, gwh), 1))

To calculate the GIoU/DIoU/CIoU loss, the predicted bbox needs to be decoded. After the sigmoid, pxy is also in the range (0, 1). pwh is multiplied by anchor_vec, which is the anchor divided by the stride, so its range is also (0, gain).

pxy = torch.sigmoid(ps[:, 0:2])  # pxy = pxy * s - (s - 1) / 2,  s = 1.5  (scale_xy)
pwh = torch.exp(ps[:, 2:4]).clamp(max=1E3) * anchor_vec[i]
pbox = torch.cat((pxy, pwh), 1)  # predicted box
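
A minimal sketch of the encoding and decoding steps described above, for a single box and in plain Python rather than PyTorch. The helper names `encode_target` and `decode_prediction`, the example box, the 12x16 pixel anchor, and the stride of 8 (an 80x80 grid) are assumptions chosen for illustration; they are not taken from the repository.

```python
import math

def encode_target(box_norm, gain):
    """Scale a normalized (x, y, w, h) box to grid units and split off the cell index."""
    x, y, w, h = (v * gain for v in box_norm)
    gi, gj = int(x), int(y)                  # index of the grid cell containing the center
    return (gi, gj), (x - gi, y - gj, w, h)  # center offset within the cell, size in grid units

def decode_prediction(raw, anchor_px, stride):
    """Decode raw outputs (tx, ty, tw, th) into a cell-relative box in grid units."""
    tx, ty, tw, th = raw
    aw, ah = anchor_px[0] / stride, anchor_px[1] / stride  # anchor in grid units (anchor_vec)
    px = 1.0 / (1.0 + math.exp(-tx))                        # sigmoid
    py = 1.0 / (1.0 + math.exp(-ty))
    pw = min(math.exp(tw), 1e3) * aw                        # exp, clamped at 1e3, times anchor
    ph = min(math.exp(th), 1e3) * ah
    return (px, py, pw, ph)

# Hypothetical example values: 80x80 grid (gain = 80), stride 8.
cell, tbox = encode_target((0.504, 0.31, 0.1, 0.2), gain=80)
pbox = decode_prediction((0.0, 0.0, 0.0, 0.0), anchor_px=(12, 16), stride=8)
print(cell, tbox, pbox)
```

In this sketch both `tbox` and `pbox` end up in the same units (grid cells) and relative to the same cell origin, which is the property the question below depends on.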

Now my question: in this case, is it OK to calculate the IoU directly? The performance is excellent, so it must be fine, and I have probably missed or misunderstood something here.
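
One way to check this numerically is sketched below. It relies only on the fact that IoU is unchanged when both boxes are shifted by the same offset, so comparing two boxes relative to the same grid cell should give the same IoU as comparing them in absolute grid coordinates. The function names, the box values, and the cell index (40, 24) are assumptions for illustration, not the repository's code.

```python
def iou_xywh(a, b):
    """IoU of two boxes given as (center_x, center_y, width, height)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union

def shift(box, dx, dy):
    """Translate a box's center, leaving its width and height unchanged."""
    return (box[0] + dx, box[1] + dy, box[2], box[3])

# Hypothetical cell-relative boxes: centers are offsets within the cell, sizes in grid units.
tbox = (0.32, 0.80, 8.0, 16.0)   # target
pbox = (0.50, 0.50, 6.0, 12.0)   # decoded prediction

iou_rel = iou_xywh(tbox, pbox)                                # relative to the cell corner
iou_abs = iou_xywh(shift(tbox, 40, 24), shift(pbox, 40, 24))  # absolute grid coordinates
print(iou_rel, iou_abs)
```

If the two values agree up to floating-point error, the cell-relative encoding loses nothing for the IoU term, as long as the target and the prediction are expressed relative to the same cell and in the same units.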

Thank you.

WongKinYiu commented 3 years ago

https://github.com/WongKinYiu/ScaledYOLOv4/issues/90#issuecomment-743179052