What is the difference between 'head' and 'intermediate' in 'obj_embedding_head'?

amirbar / DETReg

Official implementation of the CVPR 2022 paper "DETReg: Unsupervised Pretraining with Region Priors for Object Detection".

https://amirbar.net/detreg

Apache License 2.0


What is the difference between 'head' and 'intermediate' in 'obj_embedding_head'? #50

Closed. Cohesion97 closed this issue 2 years ago.

Cohesion97 commented 2 years ago

https://github.com/amirbar/DETReg/blob/0a258d879d8981b27ab032b83defc6dfcbf07d35/models/backbone.py#L156-L177

It seems that 'head' is a newer training setting that aligns features with dim=128, whereas the paper uses dim=512 ('intermediate'). Does this mean we should switch to dim=128 ('head') to get better performance from DETReg?

Thanks.

amirbar commented 2 years ago

Hi, we've explored both options. The default option ('intermediate', used in the paper) takes the average over an intermediate feature map of a SwAV ResNet50. The other option ('head') uses the feature vector output by the SwAV projector. With everything else held fixed, the two options perform similarly well.
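For readers of this thread, here is a minimal, framework-free sketch of how the two embedding targets differ. It is not the code from `models/backbone.py`. The names `average_pool`, `projector`, and `embed` are hypothetical, as are the tiny dimensions (8 channels, 4 projector outputs), the random weights, and the two-layer ReLU projector. For simplicity, both modes here start from the same feature map, which may not match the repository.

```python
import random


def average_pool(feature_map):
    """Average a [C][H][W] feature map over its spatial positions -> C values."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def linear(vec, weights):
    """Multiply vec by a [out][in] weight matrix (no bias)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def projector(vec, w_hidden, w_out):
    """Toy two-layer MLP with ReLU, standing in for a projection head (assumed shape)."""
    hidden = [max(0.0, h) for h in linear(vec, w_hidden)]
    return linear(hidden, w_out)


def embed(feature_map, mode, w_hidden=None, w_out=None):
    """Return the embedding target for the given mode."""
    pooled = average_pool(feature_map)
    if mode == "intermediate":
        # Target is the spatially averaged feature map itself.
        return pooled
    if mode == "head":
        # Target is the (lower-dimensional) projector output.
        return projector(pooled, w_hidden, w_out)
    raise ValueError(f"unknown mode: {mode!r}")


random.seed(0)
C, H, W, HIDDEN, OUT = 8, 2, 2, 8, 4  # toy sizes, not the real 512 / 128
feature_map = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w_hidden = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(HIDDEN)]
w_out = [[random.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(OUT)]

target_intermediate = embed(feature_map, "intermediate")
target_head = embed(feature_map, "head", w_hidden, w_out)
print(len(target_intermediate), len(target_head))
```

The only point of the sketch is that the two settings produce targets of different dimensionality from the same encoder, so the detector's embedding branch has to match whichever one is selected.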