hustvl / Vim

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Apache License 2.0
2.55k stars, 159 forks

Object detection and instance segmentation #78

Open 051511wang opened 1 month ago

051511wang commented 1 month ago

Hi, thank you for open-sourcing the classification code. I am confused about the object detection experiments in the paper. The appendix only mentions "using standard Cascade Mask R-CNN as the basic framework". How did you integrate Vim into that framework? Did you only replace the ResNet backbone with Vim? I would appreciate any advice or guidance. Thank you!