MzeroMiko / VMamba

VMamba: Visual State Space Models; code is based on Mamba
MIT License
2.06k stars 123 forks

How to use Vmamba as backbone #242

Open WangYuSenn opened 3 months ago

WangYuSenn commented 3 months ago

How do I use VMamba as a backbone? What are the feature channel size and image resolution at each layer? The operation after passing through the layer in your code does not seem to match the paper. (image attached)

MzeroMiko commented 3 months ago

The shapes are exactly as described in the arXiv paper. You can print x before the layer operation to check that.
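One way to inspect x before each layer without editing the model code is a forward pre-hook. The sketch below uses a toy stand-in, not the real VMamba model: `ToyBackbone`, its plain convolutions, and the `layers` attribute name are all assumptions for illustration, and the real model may use a different tensor layout. Only the hook mechanism (`register_forward_pre_hook`) is standard PyTorch.

```python
import torch
import torch.nn as nn


class ToyBackbone(nn.Module):
    """Hypothetical stand-in for a hierarchical backbone (NOT the real VMamba)."""

    def __init__(self, dims=(96, 192, 384, 768)):
        super().__init__()
        # Patch embedding: non-overlapping 4x4 patches, so resolution drops by 4.
        self.patch_embed = nn.Conv2d(3, dims[0], kernel_size=4, stride=4)
        layers = []
        for i, dim in enumerate(dims):
            # Placeholder for the stage's blocks; keeps channels and resolution.
            blocks = nn.Conv2d(dim, dim, kernel_size=3, padding=1)
            # Assumption: each layer ends with the downsampling that feeds the next stage.
            if i < len(dims) - 1:
                down = nn.Conv2d(dim, dims[i + 1], kernel_size=2, stride=2)
            else:
                down = nn.Identity()
            layers.append(nn.Sequential(blocks, down))
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        x = self.patch_embed(x)
        for layer in self.layers:
            x = layer(x)
        return x


def record_layer_input_shapes(model, image):
    """Return the shape of the tensor entering each entry of model.layers."""
    shapes = []

    def pre_hook(module, args):
        # args is the tuple of positional inputs passed to the layer.
        shapes.append(tuple(args[0].shape))

    handles = [layer.register_forward_pre_hook(pre_hook) for layer in model.layers]
    with torch.no_grad():
        model(image)
    for handle in handles:
        handle.remove()  # leave the model unmodified afterwards
    return shapes


shapes = record_layer_input_shapes(ToyBackbone(), torch.zeros(1, 3, 224, 224))
for index, shape in enumerate(shapes):
    print(f"input to layer {index}: {shape}")
```

If the real model exposes its stages under a different attribute, only the list of modules passed to the hook registration needs to change.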

WangYuSenn commented 3 months ago

If I use x before the layer operation as the output, does x then skip the Downsampling and VSSBlock operations of Stage 4? In that case, can I treat the Patch Partition output as the backbone's first-level output, the Stage 1 output as the second-level output, and so on, up to the Stage 3 output as the fourth-level output?

MzeroMiko commented 3 months ago

Hierarchical backbones have been used in modern networks for a long time. You can refer to the Swin or MetaFormer papers for more details.
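As a rough illustration of the hierarchical convention, the sketch below computes the feature-map shape at each stage of a Swin-style backbone: a patch size of 4, then the resolution halves and the channel count doubles at every later stage. The function name and the defaults (`base_dim=96`, four stages) are assumptions chosen to match the Swin-T layout, so check them against the configuration you actually use.

```python
def stage_shapes(height, width, patch_size=4, base_dim=96, num_stages=4):
    """Return (stride, channels, feat_height, feat_width) for each stage.

    Assumes a Swin-style hierarchy: stage 1 runs at 1/patch_size resolution,
    and every later stage halves the resolution and doubles the channels.
    """
    shapes = []
    stride = patch_size
    channels = base_dim
    for _ in range(num_stages):
        shapes.append((stride, channels, height // stride, width // stride))
        stride *= 2
        channels *= 2
    return shapes


for stride, channels, feat_h, feat_w in stage_shapes(224, 224):
    print(f"stride {stride:>2}: {channels} channels, {feat_h}x{feat_w}")
```

Under this convention, a detection or segmentation head typically consumes the output of each stage's blocks, taken before the downsampling that leads into the next stage.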