target assignment implementation

dbrazey commented 2 years ago

Hello,

Thank you for sharing your great work :)

I am learning how yolov3 works, and I have a question concerning your target assignment implementation. So my question is not directly related to your contributions.

In class RAPID, you perform iou computation twice. A first one to find which anchor (among the 9 anchors, therefore in the current grid size, and in the other ones) is responsible for detecting the GT objects. A second one to measure the correctness of predictions compared to GT boxes to set the penalty_mask.

My question is about the first iou, namely anchor_ious = iou_mask(gt_boxes, norm_anch_00wha, xywha=True, mask_size=64, is_degree=True)

where gt_boxes and norm_anch_00wha are in the form (0, 0, w, h, a), with w and h normalized between 0 and 1. I don't understand the use of x = 0 and y = 0 in the boxes. The computed iou supposes that the anchor and the GT boxes have the same center, which is not true. Shouldn't the GT box x, y be set to the offset between the current grid cell and the location of the GT in the full image?

I think I am misunderstanding something, and I would appreciate your help.

Thank you for your help :)

duanzhiihao commented 2 years ago

Hi, thank you for your interest :)

> In class RAPID, you perform iou computation twice. A first one to find which anchor (among the 9 anchors, therefore in the current grid size, and in the other ones) is responsible for detecting the GT objects. A second one to measure the correctness of predictions compared to GT boxes to set the penalty_mask.

Yes exactly.

> The computed iou supposes that the anchor and the GT boxes have the same center, which is not true.

You have a good point. I agree that anchor boxes and GT boxes do not have the same center. However, the current algorithm still makes sense. Consider the following example:

Black lines: grid
Green: anchor boxes
Red solid: GT box
Red dashed: GT box, centered to (0,0,w,h,a)

If we do not "normalize" the GT box to (0,0,w,h,a), it is likely that the GT box cannot find a corresponding anchor that is responsible for it. Although this problem could be solved by carefully designing the anchor box size according to grid size, it requires extra human effort. In contrast, using the (0,0,w,h,a) method is easy to implement and (I guess) does not decrease performance.
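
To make this concrete, here is a minimal sketch, assuming axis-aligned boxes (the angle a is ignored for simplicity). The helper iou_xywh and the box values are hypothetical and are not the repository's iou_mask. It only illustrates that a GT box kept at its real location may not overlap an origin-centered anchor at all, while the centered version compares shapes:

```python
def iou_xywh(box_a, box_b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap extent along each axis (zero if the boxes are disjoint).
    iw = max(0.0, min(ax + aw / 2, bx + bw / 2) - max(ax - aw / 2, bx - bw / 2))
    ih = max(0.0, min(ay + ah / 2, by + bh / 2) - max(ay - ah / 2, by - bh / 2))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# Hypothetical values, normalized to the image size.
gt = (0.7, 0.6, 0.2, 0.3)      # GT box at its real location
anchor = (0.0, 0.0, 0.2, 0.3)  # anchor with the same shape, placed at the origin

# GT kept at its real center: the two boxes are disjoint.
iou_real_center = iou_xywh(gt, anchor)

# GT moved to (0, 0, w, h): only the shapes are compared.
iou_shared_center = iou_xywh((0.0, 0.0) + gt[2:], anchor)

print(iou_real_center, iou_shared_center)
```

With the real center, even an anchor of exactly the right shape gets no overlap, so no anchor would be selected for this GT.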

This is my understanding of the way YOLO assigns anchors to GTs. Please let me know if you have other comments.

dbrazey commented 2 years ago

Thank you for your precise and fast reply!

I will think about this over the next few days and come back if I have another comment or question.

Have a nice weekend

duanzhiihao commented 2 years ago

No problem :)

dbrazey commented 2 years ago

Hello,

I think that the implementation is correct.

The objective of this iou computation is to determine which anchor is closest in shape to the gt object, in order to choose the most meaningful one for target assignment. It therefore makes sense to give them the same center, because the location does not affect the shape comparison.
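
As a rough sketch of that shape-only selection (my own simplification, not the repository code): when two axis-aligned boxes share a center, the intersection is just min(w) * min(h), so the best anchor is an argmax over the 9 shape IoUs. The anchor sizes below are made up, the angle is ignored, and the mapping to a scale assumes three anchors per scale listed in order:

```python
def shape_iou(wh_a, wh_b):
    """IoU of two axis-aligned boxes that share the same center."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union


def best_anchor(gt_wh, anchors_wh):
    """Index of the anchor whose shape is closest to the GT box."""
    ious = [shape_iou(gt_wh, a) for a in anchors_wh]
    return max(range(len(ious)), key=ious.__getitem__)


# Hypothetical normalized (w, h) anchors, three per scale.
anchors = [
    (0.05, 0.05), (0.10, 0.20), (0.20, 0.10),  # scale 0
    (0.30, 0.30), (0.50, 0.25), (0.25, 0.50),  # scale 1
    (0.60, 0.60), (0.80, 0.50), (0.50, 0.80),  # scale 2
]
gt_wh = (0.22, 0.12)  # hypothetical GT width and height

idx = best_anchor(gt_wh, anchors)
scale = idx // 3  # assumption: anchors are grouped three per scale
print(idx, scale)
```

The GT location never enters this computation. It is only needed afterwards, to pick the grid cell on the chosen scale.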

We are not comparing a gt and a prediction to know if the prediction is correct; had that been the case, forcing a common center would have been a problem.

Please tell me if my understanding is not correct.

Thank you for your help.

duanzhiihao commented 2 years ago

I believe your understanding is correct. We only compare shapes but not locations between anchors and GTs.

> We are not comparing a gt and a prediction to know if the prediction is correct

Strictly speaking, we do compare the GT and predictions in the second IoU computation (linked below). In the second IoU computation, we want to find those very good predictions and do not penalize them in the loss function. https://github.com/duanzhiihao/RAPiD/blob/bc0f2f049ac376536d8de8bf11af12973ccd18e4/models/rapid.py#L273
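
A simplified sketch of the idea behind this second IoU, assuming axis-aligned boxes (no angle) with their real centers. The names keep_penalty and ignore_thresh and the value 0.5 are hypothetical and are not taken from rapid.py:

```python
def iou_xywh(box_a, box_b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0.0, min(ax + aw / 2, bx + bw / 2) - max(ax - aw / 2, bx - bw / 2))
    ih = max(0.0, min(ay + ah / 2, by + bh / 2) - max(ay - ah / 2, by - bh / 2))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def keep_penalty(pred_boxes, gt_boxes, ignore_thresh=0.5):
    """For each prediction, True if it should still be penalized.

    A prediction whose best IoU with any GT exceeds ignore_thresh is
    considered good enough and is excluded from the penalty.
    """
    mask = []
    for pred in pred_boxes:
        best = max((iou_xywh(pred, gt) for gt in gt_boxes), default=0.0)
        mask.append(best <= ignore_thresh)
    return mask


# Hypothetical normalized boxes (cx, cy, w, h).
gts = [(0.5, 0.5, 0.2, 0.2)]
preds = [
    (0.51, 0.5, 0.2, 0.2),  # almost on top of the GT
    (0.10, 0.1, 0.2, 0.2),  # far from the GT
]
mask = keep_penalty(preds, gts)
print(mask)
```

Here the box centers matter, which is why this second computation uses the real x, y instead of (0, 0).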

dbrazey commented 2 years ago

Yes, I agree with your remark, and I noticed that the center is set correctly in the code for the second iou.

I am therefore closing the issue. I hope it will help other people one day!

Thanks for sharing your knowledge :)

target assignment implementation #38