SysCV / cascade-detr

[ICCV'23] Cascade-DETR: Delving into High-Quality Universal Object Detection
https://arxiv.org/abs/2307.11035
Apache License 2.0

Question about equation (2) in the paper. #7

Open toffeecat opened 9 months ago

toffeecat commented 9 months ago

Hi, thank you for your work on Cascade-DETR. I am a bit confused by one statement in the paper's description of the predicted bounding boxes: "Si is the set of 2D locations inside the predicted bounding box Bi from the preceding decoder layer i." Does Bi refer to the predicted boxes of all queries, or only to the boxes matched to ground truth? I ask because the boxes predicted by each decoder layer have shape (Number_of_queries, 4) and have not yet been matched by the Hungarian matcher. As far as I know, the matcher is used only once, before the criterion is calculated. So, to achieve cascade attention, do we need to apply matching to the box output of every decoder layer? Looking forward to your reply.
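
To make the "all queries" reading concrete, here is a minimal sketch of how the set Si could be built for every query at once, with no matching step. It is only an assumption for illustration, not the repository's code: the function name `box_attention_mask`, the normalized (cx, cy, w, h) box format, and the choice to test feature-map cell centers are all hypothetical, and it uses NumPy rather than the project's actual tensors.

```python
import numpy as np

def box_attention_mask(boxes_cxcywh, height, width):
    """Hypothetical helper: one boolean mask row per query.

    boxes_cxcywh: array of shape (num_queries, 4) holding normalized
                  (cx, cy, w, h) boxes in [0, 1] (an assumed format).
    Returns an array of shape (num_queries, height * width) where entry
    [q, p] is True if the center of feature-map cell p lies inside box q.
    """
    boxes = np.asarray(boxes_cxcywh, dtype=np.float64)
    cx, cy, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    x0, x1 = cx - w / 2, cx + w / 2
    y0, y1 = cy - h / 2, cy + h / 2

    # Normalized coordinates of the cell centers on an H x W feature map.
    xs = (np.arange(width) + 0.5) / width
    ys = (np.arange(height) + 0.5) / height
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")  # each (H, W)
    gx = grid_x.reshape(1, -1)  # (1, H*W), row-major flattening
    gy = grid_y.reshape(1, -1)

    # Broadcast every query box against every location; no matching involved.
    return (
        (gx >= x0[:, None]) & (gx <= x1[:, None])
        & (gy >= y0[:, None]) & (gy <= y1[:, None])
    )

# Two example queries: a full-image box and a top-left quarter box.
boxes = np.array([[0.5, 0.5, 1.0, 1.0],
                  [0.25, 0.25, 0.5, 0.5]])
mask = box_attention_mask(boxes, height=4, width=4)
print(mask.shape)  # (num_queries, H*W)
```

Under this reading the mask has one row per query, so the Hungarian matcher would not be needed to construct Si. Whether this is what equation (2) intends is exactly what I would like to confirm.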