akhilpm / DroneDetectron2

PyTorch code for our CVPRW 2023 paper "Cascaded Zoom-in Detector for High Resolution Aerial Images"
MIT License

Some questions about the results listed in Table1 and Table6 in the paper #38

Closed chen-chen32 closed 8 months ago

chen-chen32 commented 8 months ago

Dear Author,

In the paper, Tables 1, 2, and 6 present the model’s test results on the VisDrone dataset. Regarding these results, I have two questions:

  1. The paper mentions, “Following the existing works, we used the validation set for evaluating the performance.” Does this imply that the results listed above are all based on the validation set, which consists of 548 pictures?
  2. It is noted in the paper that a resolution of (1.5K pixels) was adopted. Could you please clarify what specific resolution this refers to? Additionally, could you specify the input resolutions used for the results in Tables 1, 2, and 6?

I look forward to your response and appreciate your assistance. Best regards!

akhilpm commented 8 months ago

Hi, thanks for your interest in our work! To answer your questions:

  1. Yes, the results are reported on the validation set consisting of 548 images.
  2. 1.5K pixels is the average width of images in the VisDrone dataset. The config file for each setting specifies the input resolution used for training and inference; see, for example, the Base Faster RCNN config used at inference. Images are resized so that they respect the given MIN_SIZE and MAX_SIZE constraints. Please check the Detectron2 code for additional details.
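
For readers who want to see what "resized respecting MIN_SIZE and MAX_SIZE" means in practice, here is a minimal pure-Python sketch of shortest-edge resizing as Detectron2's `ResizeShortestEdge` transform is understood to do it. The function name `resized_shape` is hypothetical. The values 800 and 1333 are only illustrative (they are Detectron2's defaults for `INPUT.MIN_SIZE_TEST` and `INPUT.MAX_SIZE_TEST`) and are not necessarily the values in this repository's configs, so check the relevant config file for the actual numbers.

```python
def resized_shape(height, width, min_size, max_size):
    """Return (new_height, new_width) after shortest-edge resizing.

    The shorter side is scaled to `min_size`; if the longer side would
    then exceed `max_size`, the whole image is scaled down again so the
    longer side equals `max_size`. Aspect ratio is preserved throughout.
    """
    scale = float(min_size) / min(height, width)
    if height < width:
        new_h, new_w = float(min_size), scale * width
    else:
        new_h, new_w = scale * height, float(min_size)

    # Enforce the cap on the longer side.
    longer = max(new_h, new_w)
    if longer > max_size:
        shrink = float(max_size) / longer
        new_h, new_w = new_h * shrink, new_w * shrink

    # Round to the nearest integer pixel count.
    return int(new_h + 0.5), int(new_w + 0.5)


if __name__ == "__main__":
    # Illustrative values only: a 1920x1080 frame with the assumed limits.
    print(resized_shape(1080, 1920, min_size=800, max_size=1333))
```

With these assumed limits, a 1920x1080 image ends up at 1333x750: the cap on the longer side takes effect, so the shorter side lands below MIN_SIZE. This is why the effective input resolution depends on both settings and on each image's aspect ratio.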