cvlab-stonybrook / LearningToCountEverything

MIT License

About the baselines in your CVPR2021 paper #13

Closed zhiyuanyou closed 2 years ago

zhiyuanyou commented 2 years ago

Hello, in your paper you compare FamNet with object detectors, including Faster RCNN, RetinaNet, and Mask RCNN, in Table 2. In my view, Mask RCNN is essentially Faster RCNN with an additional mask head for instance segmentation. I wonder what the main differences between Faster RCNN and Mask RCNN are in your experiments.

Viresh-R commented 2 years ago

Hey, as stated in the paper, we use the pretrained Faster RCNN, RetinaNet, and Mask RCNN from the detectron2 library for this experiment. So the main difference between Faster RCNN and Mask RCNN is the mask prediction branch in Mask RCNN. Note that for our counting experiments, we only use the classification and bounding box regression branches, and ignore the mask prediction branch of Mask RCNN.
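To illustrate how a detector baseline produces a count, here is a minimal sketch: the count is just the number of detections of the target category whose classification score clears a threshold. The function name, the list-based inputs, and the 0.5 threshold are illustrative assumptions, not the authors' exact pipeline (detectron2 returns an `Instances` object with `pred_classes` and `scores` fields that can be fed in the same way).

```python
def count_category(pred_classes, scores, target_class, score_thresh=0.5):
    """Count detections of `target_class` scoring at least `score_thresh`.

    pred_classes: list of predicted category ids (one per detection)
    scores: list of classification scores (same order)
    """
    return sum(
        1
        for cls, score in zip(pred_classes, scores)
        if cls == target_class and score >= score_thresh
    )


# Example: three detections, two of class 0, one of class 1.
# Only the class-0 detection with score 0.9 passes the 0.5 threshold.
n = count_category(pred_classes=[0, 0, 1], scores=[0.9, 0.4, 0.8],
                   target_class=0)
print(n)  # → 1
```

The mask branch of Mask RCNN never enters this computation, which is why Faster RCNN and Mask RCNN behave so similarly as counting baselines.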

zhiyuanyou commented 2 years ago

Thanks for your response. I have done some follow-up work based on yours and plan to submit a paper to CVPR 2022. However, I am a little confused about selecting the Subject Areas. Could you please tell me which Subject Areas you selected when you submitted your paper to CVPR 2021? Currently I have selected "Transfer/low-shot/long-tail learning" as Primary, and "Scene analysis and understanding" and "Vision applications and systems" as Secondary.



Viresh-R commented 2 years ago

From what I remember, we picked "Face and gesture" (since "Face" seemed like a relevant area for crowd counting) and "Transfer/low-shot" as the primary and secondary.