Hello, author. I would like to ask: how are the minimum and maximum values of the K scale ranges defined in the paper chosen for each group?
Hi~ We divide the queries into three groups, responsible for objects with relative scales in the ranges (0, 0.2], (0.2, 0.4], and (0.4, 1], respectively. Here is how these ranges are derived: images are generally scaled and padded to 640×640 when training on COCO. We first define three absolute scale ranges, (0, 128 (2^7)], (128 (2^7), 256 (2^8)], and (256 (2^8), 640], and then convert them into ranges relative to the image size: (0/640, 128/640], (128/640, 256/640], (256/640, 640/640] → (0, 0.2], (0.2, 0.4], (0.4, 1]. Choosing these values took some trial and error; we found this combination works best for training on COCO.
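For illustration, here is a minimal sketch (not the authors' code) of how a ground-truth box could be assigned to one of the three query groups by relative scale, assuming the 640×640 setting above. The relative-scale definition sqrt(w·h)/640 is my assumption; the paper may define it differently (e.g. max(w, h)/640).

```python
import math

IMG_SIZE = 640
# Relative ranges derived from the absolute ones (0, 128], (128, 256], (256, 640]
SCALE_RANGES = [(0.0, 0.2), (0.2, 0.4), (0.4, 1.0)]  # half-open on the left: (lo, hi]

def group_index(box_w: float, box_h: float) -> int:
    """Return the query-group index for a box of size (box_w, box_h) in pixels."""
    rel_scale = math.sqrt(box_w * box_h) / IMG_SIZE  # assumed definition of relative scale
    for i, (lo, hi) in enumerate(SCALE_RANGES):
        if lo < rel_scale <= hi:
            return i
    return len(SCALE_RANGES) - 1  # clamp anything above 1.0 into the last group

# Example: a 100x90 object has relative scale ~0.148, so it maps to group 0
print(group_index(100, 90))
```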
Thank you for your reply; I still have some questions about this paper.
In the preference-extraction part, A_i, i.e., the new anchor boxes, are obtained directly through preference extraction. In the next epoch, these new anchor boxes are used directly as input, and queries are then generated from them for training, right?
Your understanding of both questions is correct ~
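As a schematic illustration of that confirmed flow, here is a minimal runnable sketch. Every function name below (generate_queries, train_one_epoch, extract_preferences) is a placeholder of mine, not this repository's API; only the epoch-to-epoch anchor hand-off is the point.

```python
import torch

def generate_queries(anchors: torch.Tensor) -> torch.Tensor:
    # Placeholder: in practice, queries are derived from the current anchor boxes.
    return anchors.clone()

def train_one_epoch(queries: torch.Tensor) -> torch.Tensor:
    # Placeholder for a normal training pass; returns per-query refined boxes.
    return queries + 0.01 * torch.randn_like(queries)

def extract_preferences(pred_boxes: torch.Tensor) -> torch.Tensor:
    # Placeholder "preference extraction": the refined boxes A_{i+1}
    # become the anchors fed into the next epoch.
    return pred_boxes.detach()

anchors = torch.rand(300, 4)              # initial anchors A_0: 300 boxes, (cx, cy, w, h)
for epoch in range(3):
    queries = generate_queries(anchors)   # queries built from the current anchors
    preds = train_one_epoch(queries)      # training pass for this epoch
    anchors = extract_preferences(preds)  # new anchors used as input in the next epoch
```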