I have a question about the bounding box encoding used to calculate the IoU loss.
The y_true label (x, y, w, h) is calculated here: gxy becomes the offset from the top-left corner of the corresponding grid cell, so its range is (0, 1). gwh is left unchanged, so its range is (0, gain), where gain is typically 20, 40, or 80.
tbox.append(torch.cat((gxy - gij, gwh), 1))
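To make the target encoding concrete, here is a minimal sketch of how I understand it, in plain Python rather than torch. The function name encode_target and the example numbers are my own and are not from the repo, and I am assuming the labels are normalized to [0, 1] and scaled by the grid size (gain) before this step.

```python
import math

def encode_target(box_norm, gain):
    """Hypothetical helper: box_norm is (x_center, y_center, w, h) in [0, 1]."""
    x, y, w, h = box_norm
    gx, gy = x * gain, y * gain        # box center in grid units
    gw, gh = w * gain, h * gain        # box size in grid units, range (0, gain)
    gi, gj = math.floor(gx), math.floor(gy)  # index of the cell holding the center
    # Same idea as torch.cat((gxy - gij, gwh), 1): offset inside the cell, raw size
    return (gx - gi, gy - gj, gw, gh), (gi, gj)

tbox, cell = encode_target((0.5625, 0.3125, 0.5, 0.25), gain=20)
print(tbox, cell)
```

So the x, y part of the target is an offset inside the cell, while w, h stay in grid units.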
To calculate the GIoU/CIoU/DIoU loss, the predicted bbox needs to be decoded. pxy goes through a sigmoid, so its range is also (0, 1).
pwh is multiplied by anchor_vec (the anchors divided by the stride), so its range is also (0, gain).
My question is: in this case, is it OK to calculate the IoU directly?
The performance is great, so it must be OK. Maybe I missed or misunderstood something here.
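For reference, this is a small sketch of the computation as I understand it, again in plain Python. The helper names are my own, and I am assuming the width and height are decoded as exp(raw) * anchor, which I have not confirmed against the repo. If this is right, both boxes are expressed relative to the same cell and in the same grid units, and adding the cell index back to both centers should not change the IoU.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def decode_prediction(raw, anchor_wh):
    """Hypothetical decode: raw is (tx, ty, tw, th), anchor_wh is in grid units."""
    tx, ty, tw, th = raw
    aw, ah = anchor_wh
    # Assumption: size is exp(raw) * anchor; center is a sigmoid offset in (0, 1)
    return (sigmoid(tx), sigmoid(ty), math.exp(tw) * aw, math.exp(th) * ah)

def iou_xywh(a, b):
    """Plain IoU of two boxes given as (x_center, y_center, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union

def shift(box, dx, dy):
    return (box[0] + dx, box[1] + dy, box[2], box[3])

pbox = decode_prediction((0.0, 0.0, 0.0, 0.0), anchor_wh=(10.0, 5.0))
tbox = (0.25, 0.25, 10.0, 5.0)   # cell-relative target, size in grid units
gi, gj = 11, 6                   # example cell index shared by both boxes

iou_relative = iou_xywh(pbox, tbox)
iou_absolute = iou_xywh(shift(pbox, gi, gj), shift(tbox, gi, gj))
print(iou_relative, iou_absolute)
```

The two values agree up to floating-point error in this sketch, which would explain why the cell-relative boxes can be compared directly.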
Thanks for sharing the great code.