MzeroMiko / VMamba

VMamba: Visual State Space Models, code is based on mamba
MIT License

Some problems about the architecture (down sampling) #146

Open HongyuZhu999 opened 5 months ago

HongyuZhu999 commented 5 months ago

I'm curious why downsampling is used, since it reduces the information in the data. Would the model be better if I didn't use it?

MzeroMiko commented 5 months ago

It's a cute question. I simply inherited the code from Swin-Transformer and kept its architecture.

You can try a plain ViT architecture to check whether VMamba works under that structure.
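For anyone comparing the two layouts, here is a rough sketch of the shape arithmetic only, with no model code. It assumes the Swin-T style defaults (224x224 input, 4x4 patches, 96 channels, four stages, resolution halved and channels doubled between stages) and a ViT-B/16 style plain layout (16x16 patches, 768 channels, no downsampling). The names `Stage`, `hierarchical_stages`, and `plain_stages` are hypothetical helpers for illustration, not part of the VMamba code, and VMamba's actual configs may differ.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """Shape of the feature map produced by one stage (hypothetical helper)."""
    height: int
    width: int
    channels: int

    @property
    def tokens(self) -> int:
        # Number of spatial positions the blocks in this stage operate on.
        return self.height * self.width

    @property
    def activations(self) -> int:
        # Total number of values in the feature map (tokens x channels).
        return self.tokens * self.channels


def hierarchical_stages(image_size=224, patch_size=4, embed_dim=96, num_stages=4):
    """Swin-style layout (assumed defaults): each downsampling step halves
    the resolution and doubles the channel count."""
    side = image_size // patch_size
    channels = embed_dim
    stages = []
    for _ in range(num_stages):
        stages.append(Stage(side, side, channels))
        side //= 2       # 2x2 neighbouring tokens are merged into one
        channels *= 2    # channel width doubles to compensate
    return stages


def plain_stages(image_size=224, patch_size=16, embed_dim=768, num_stages=4):
    """Plain ViT-style layout (assumed defaults): resolution and width stay
    fixed, so the 'stages' are just groups of identical blocks."""
    side = image_size // patch_size
    return [Stage(side, side, embed_dim) for _ in range(num_stages)]


if __name__ == "__main__":
    for name, stages in [("hierarchical", hierarchical_stages()),
                         ("plain", plain_stages())]:
        print(name)
        for i, s in enumerate(stages, start=1):
            print(f"  stage {i}: {s.height}x{s.width}x{s.channels} "
                  f"tokens={s.tokens} activations={s.activations}")
```

Under these assumptions, each downsampling step cuts the token count by 4x while doubling the channels, so the feature map holds half as many values after every step. The plain layout keeps 196 tokens at full width throughout. Whether that reduction actually hurts accuracy is an empirical question that this arithmetic does not answer; it only shows what changes between the two structures.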