Question about how to use a ViT backbone in detection

raoyongming / DenseCLIP

[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

Hi, thank you for your good work.

I tried to use the ViT backbone for detection with mmdetection. I noticed that the ViT path goes through the class DenseCLIP_MaskRCNN in detection/denseclip/denseclip.py, but when I add a ViT config I cannot get it to run correctly: there are dimension mismatches and other incompatibility problems. It seems that the ViT version of detection is not complete. Can you give me some suggestions? If I want to use ViT as the backbone with DenseCLIP on COCO, can I use denseclip.py from segmentation? Or could you provide a ViT detection configuration? Or am I using it the wrong way?

Thanks!

Question about how to use a ViT backbone in detection #43