wusize / ovdet

[CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection
https://openaccess.thecvf.com/content/CVPR2023/papers/Wu_Aligning_Bag_of_Regions_for_Open-Vocabulary_Object_Detection_CVPR_2023_paper.pdf

Choice for different backbones #42

Open fushh opened 7 months ago

fushh commented 7 months ago

Thanks for the great work!

BARON uses ResNet50-FPN as the backbone when CLIP is the supervision, but ResNet50-C4 when captions are the supervision. I'm curious why the backbone differs between the two kinds of supervision. Why not use ResNet50-FPN with caption supervision as well?
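
For readers less familiar with the two layouts, here is a minimal sketch of how they differ structurally. It assumes the usual Faster R-CNN conventions (C4 exposes one stride-16 feature map, FPN exposes five levels at strides 4 to 64); the stride tables and the `feature_map_sizes` helper are illustrative and are not taken from the ovdet configs. Sizes are approximate, since real convolution padding can shift them by a pixel.

```python
import math

# Assumed strides from common Faster R-CNN conventions, NOT read from the ovdet configs.
C4_STRIDES = {"res4": 16}                                        # single-scale feature map
FPN_STRIDES = {"p2": 4, "p3": 8, "p4": 16, "p5": 32, "p6": 64}   # multi-scale pyramid


def feature_map_sizes(height, width, strides):
    """Hypothetical helper: approximate (h, w) of each feature level for an input image."""
    return {
        name: (math.ceil(height / stride), math.ceil(width / stride))
        for name, stride in strides.items()
    }


c4 = feature_map_sizes(800, 1344, C4_STRIDES)
fpn = feature_map_sizes(800, 1344, FPN_STRIDES)

print("C4 :", c4)
print("FPN:", fpn)
```

With C4, every region is pooled from the same stride-16 map, while with FPN each region is assigned to one pyramid level according to its size. This says nothing about why BARON pairs each backbone with a particular supervision; that is the open question here.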