nickgkan / butd_detr

Code for the ECCV22 paper "Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds"

Question about point_object_class label #47

Closed WeitaiKang closed 7 months ago

WeitaiKang commented 7 months ago

Dear authors,

Thank you so much for your influential work. Your patience in resolving so many issues helps new researchers in the 3D field a lot. As a new researcher in this field, I would like to ask a basic question about something that I don't understand but that seems to be widely used.

In the compute_points_obj_cls_loss_hard_topk function (https://github.com/nickgkan/butd_detr/blob/10570e0b6826d4a236b18c2c8fac5903866e1c60/models/losses.py#L161), I find that you assign the object class label (a binary value) only to the top-k closest points, among all the object points and the background points. The result (objectness_label) serves as the ground truth for the model's seeds_obj_cls_logits.

I am confused about why only the top-k closest points are treated as the object for seeds_obj_cls_logits, instead of all the object points (that is, directly using obj_assignment_one_hot as the ground truth). Is it because, in the loss computation, only one proposal is used to compute the loss for each target (as assigned by HungarianMatcher)? If so, why top-k instead of top-1?

ayushjain1144 commented 7 months ago

Hi,

This code tries to match the predicted box centers to the ground-truth object centers. Matching only the closest prediction to each ground truth and supervising the rest as background can be too harsh on proposals that are also reasonably close. Matching everything on the object to the center would wrongly let bad predictions survive. Top-k strikes a balance between these two extremes.

However, note that this is fairly empirical: works that follow DETR-style matching just do 1-NN matching (though they take multiple factors into account simultaneously when matching). We took the above design from Group-Free.
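
To make the trade-off concrete, here is a minimal pure-Python sketch of hard top-k labelling next to the "all object points" and top-1 alternatives. It is an illustration only, not the repository's implementation: the function name `topk_objectness_labels`, its arguments, and the toy data are all hypothetical, and it assumes "closest" means Euclidean distance from a point to the center of the object it lies on.

```python
import math


def topk_objectness_labels(points, centers, point_to_object, k):
    """Label as positive only the k points of each object nearest its center.

    points:          list of (x, y, z) tuples
    centers:         list of (x, y, z) object centers
    point_to_object: per-point object index, or -1 for background
    (All names here are hypothetical; this is a simplified sketch.)
    """
    labels = [0] * len(points)
    for obj_idx, center in enumerate(centers):
        # Indices of the points that lie on this object.
        members = [i for i, o in enumerate(point_to_object) if o == obj_idx]
        # Order them by distance to the object's center, nearest first.
        members.sort(key=lambda i: math.dist(points[i], center))
        # Only the k nearest become positives; the rest stay 0.
        for i in members[:k]:
            labels[i] = 1
    return labels


# Toy scene: four points on one object, one background point.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.5, 0.0, 0.0),
          (0.9, 0.0, 0.0), (5.0, 5.0, 5.0)]
centers = [(0.0, 0.0, 0.0)]
point_to_object = [0, 0, 0, 0, -1]

# Every object point is positive (the "all object points" extreme).
all_points = [1 if o >= 0 else 0 for o in point_to_object]
# Only the single nearest point is positive (the top-1 extreme).
top1 = topk_objectness_labels(points, centers, point_to_object, k=1)
# Top-k with k=2 sits between the two.
top2 = topk_objectness_labels(points, centers, point_to_object, k=2)

print(all_points, top1, top2)
```

In this toy scene the point at distance 0.9 from the center is a positive under the "all object points" labelling, while top-1 also rejects the point at distance 0.1; k=2 keeps the two nearest points and drops the far ones.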


WeitaiKang commented 7 months ago

Thank you so much for your detailed explanation. I appreciate the time you took to clarify these concepts.