horrible-dong / TeamDETR

[ICIP 2023 (oral)] Team DETR: Guide Queries as a Professional Team in Detection Transformers
https://arxiv.org/pdf/2302.07116.pdf
Apache License 2.0
18 stars 2 forks

How to determine the scale range of the limit values for each group? #2

Closed notfacezhi closed 1 year ago

notfacezhi commented 1 year ago

Hello, author. How should the maximum and minimum values of the K predefined scale ranges mentioned in the paper be chosen for each group?

horrible-dong commented 1 year ago

Hi~ We divide the queries into three groups, responsible for objects with relative scales in the ranges (0, 0.2], (0.2, 0.4] and (0.4, 1], respectively. These ranges are derived as follows. When training on the COCO dataset, images are generally scaled and padded to 640×640. We first define three absolute scale ranges, (0, 128], (128, 256] and (256, 640], where 128 = 2^7 and 256 = 2^8, and then convert them into ranges relative to the image size: (0/640, 128/640], (128/640, 256/640], (256/640, 640/640] -> (0, 0.2], (0.2, 0.4], (0.4, 1]. Choosing the values takes some trial and error, and we found this combination to work best for training on COCO.
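The conversion above can be sketched in a few lines of Python. This is only an illustration, not code from the TeamDETR repository: the names `relative_ranges`, `group_index` and `box_relative_scale` are hypothetical, and measuring an object's scale by its longer box side is an assumption, since the reply does not say how the relative scale of an object is computed.

```python
# Values quoted in the reply: a 640x640 training size and
# absolute boundaries at 128 (2^7) and 256 (2^8) pixels.
IMAGE_SIZE = 640
ABS_BOUNDS = [0, 128, 256, 640]


def relative_ranges(abs_bounds, image_size):
    """Convert absolute boundaries into (low, high] ranges relative to the image size."""
    rel = [b / image_size for b in abs_bounds]
    return list(zip(rel[:-1], rel[1:]))


def group_index(rel_scale, ranges):
    """Return the index of the half-open range (low, high] that contains rel_scale."""
    for i, (low, high) in enumerate(ranges):
        if low < rel_scale <= high:
            return i
    raise ValueError(f"relative scale {rel_scale} is outside (0, 1]")


def box_relative_scale(width, height, image_size):
    """ASSUMPTION: scale = longer box side / image size (not stated in the thread)."""
    return max(width, height) / image_size


RANGES = relative_ranges(ABS_BOUNDS, IMAGE_SIZE)
print(RANGES)  # [(0.0, 0.2), (0.2, 0.4), (0.4, 1.0)]

# A 200x90 pixel object has relative scale 200/640 = 0.3125, so it falls in the second group.
print(group_index(box_relative_scale(200, 90, IMAGE_SIZE), RANGES))  # 1
```

To try a different value combination, only `ABS_BOUNDS` needs to change; the ranges are half-open on the left, so an object exactly 128 pixels on its longer side still belongs to the first group.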

notfacezhi commented 1 year ago

Thank you for your reply. I still have some questions about the paper:

  1. When grouping queries, the assignment should follow the object scale: depending on the scales of the targets in the image, each group of queries is matched only against the objects in its own scale range, so Hungarian matching should be done within each group, right?
  2. In the preference extraction part, Ai, that is, the new anchor box, is obtained directly through preference extraction. In the next epoch, the new anchor box is used directly as input, and a query is then generated from it for training, right?

horrible-dong commented 1 year ago

Your understanding of both questions is correct ~
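For readers following along, the group-wise matching confirmed in question 1 can be sketched roughly as below. This is a toy illustration under several assumptions, not the repository's implementation: the function names are hypothetical, the cost is a plain L1 box distance (DETR-style matchers normally also include classification and GIoU terms), and a brute-force search over permutations stands in for the Hungarian algorithm, which is only practical for a handful of boxes.

```python
from itertools import permutations


def l1_cost(query_box, target_box):
    """ASSUMPTION: L1 distance between (cx, cy, w, h) boxes as the only matching cost."""
    return sum(abs(q - t) for q, t in zip(query_box, target_box))


def match_within_group(query_boxes, target_boxes):
    """Minimum-cost one-to-one matching; brute force used in place of Hungarian."""
    if not target_boxes:
        return []
    if len(target_boxes) > len(query_boxes):
        raise ValueError("more targets than queries in this group")
    best_cost, best_perm = float("inf"), None
    # perm[t] is the query assigned to target t.
    for perm in permutations(range(len(query_boxes)), len(target_boxes)):
        cost = sum(l1_cost(query_boxes[q], target_boxes[t]) for t, q in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return [(q, t) for t, q in enumerate(best_perm)]


def groupwise_match(queries, query_groups, targets, target_groups, num_groups):
    """Match each group of queries only against targets assigned to the same group."""
    matches = []
    for g in range(num_groups):
        q_idx = [i for i, qg in enumerate(query_groups) if qg == g]
        t_idx = [j for j, tg in enumerate(target_groups) if tg == g]
        local = match_within_group([queries[i] for i in q_idx],
                                   [targets[j] for j in t_idx])
        # Map group-local indices back to global query/target indices.
        matches.extend((q_idx[q], t_idx[t]) for q, t in local)
    return matches


# Toy data: boxes are (cx, cy, w, h) relative to the image size.
queries = [(0.1, 0.1, 0.1, 0.1), (0.8, 0.8, 0.1, 0.1),
           (0.5, 0.5, 0.3, 0.3), (0.5, 0.5, 0.8, 0.8)]
query_groups = [0, 0, 1, 2]   # two small-scale queries, one medium, one large
targets = [(0.75, 0.8, 0.12, 0.1), (0.5, 0.5, 0.9, 0.7)]
target_groups = [0, 2]        # assigned by relative scale: (0, 0.2] and (0.4, 1]

print(groupwise_match(queries, query_groups, targets, target_groups, 3))
# [(1, 0), (3, 1)]
```

Each pair is (query index, target index). The medium-scale query is left unmatched because no target falls in its range.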