LiewFeng / imTED

[ICCV 2023] Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection
https://arxiv.org/abs/2205.09613
Apache License 2.0

Codes for the baseline (ViT based Faster R-CNN) #14

Closed lovelyqian closed 1 year ago

lovelyqian commented 1 year ago

Hi, thank you very much for releasing the code for imTED.

I am very interested in the comparison between the ViT-based Faster R-CNN and imTED, but I could not find the code for the baseline. Could you kindly provide the code related to the baseline, e.g., the model and the training config? Thanks for your time and consideration!

LiewFeng commented 1 year ago

Hi @lovelyqian, thank you for your interest. We have restructured our code, and the original config of the baseline (Faster R-CNN + ViT-B) is not fully compatible with the current code. You may reproduce it as follows:

  1. replace the config of roi_head with the commonly used StandardRoIHead
  2. replace the config of bbox_head with ConvFCBBoxHead
  3. comment out this line
  4. set with_mfm to False
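
As a rough sketch of steps 1, 2 and 4 above, assuming the model config is an MMDetection-style nested dict: the helper names (`to_baseline`, `set_key_everywhere`), the toy config, its `Placeholder...` type names, and the bbox head values are all illustrative assumptions, not the repository's actual config. Step 3 is an edit to the source code and is not covered here.

```python
import copy


def set_key_everywhere(node, key, value):
    """Recursively overwrite `key` with `value` wherever it already appears."""
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
            else:
                set_key_everywhere(node[k], key, value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            set_key_everywhere(item, key, value)


def to_baseline(model_cfg, bbox_head_kwargs):
    """Return a modified copy of `model_cfg`; the input is left untouched."""
    cfg = copy.deepcopy(model_cfg)
    # Step 1: switch the RoI head type (spelled StandardRoIHead in MMDetection).
    cfg["roi_head"]["type"] = "StandardRoIHead"
    # Step 2: replace the bbox head config with a ConvFCBBoxHead config.
    cfg["roi_head"]["bbox_head"] = dict(type="ConvFCBBoxHead", **bbox_head_kwargs)
    # Step 4: turn off with_mfm wherever the key lives (its location is not assumed).
    set_key_everywhere(cfg, "with_mfm", False)
    return cfg


# Toy config for illustration only; the real imTED config has many more fields.
toy_model = dict(
    type="PlaceholderDetector",
    with_mfm=True,
    roi_head=dict(
        type="PlaceholderRoIHead",
        bbox_head=dict(type="PlaceholderBBoxHead"),
    ),
)

# Assumed head settings, mirroring a typical two-FC Faster R-CNN head.
baseline = to_baseline(
    toy_model,
    dict(num_shared_fcs=2, fc_out_channels=1024, num_classes=80),
)
print(baseline["roi_head"]["type"], baseline["with_mfm"])
```

Any remaining keys in roi_head that StandardRoIHead does not accept would probably need to be removed by hand as well.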

lovelyqian commented 1 year ago

Hi @LiewFeng, thank you very much for your kind reply. I will try it!

LiewFeng commented 1 year ago

Closed due to inactivity. Feel free to reopen it if necessary.