ChenhongyiYang / PlainMamba

[BMVC 2024] PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition
Apache License 2.0

Thanks for your work! I have the following questions: #1

Closed 924973292 closed 6 months ago

924973292 commented 7 months ago

1. Does the presence or absence of input positional encoding (PE) affect models that are inherently sequential, such as SSMs? Could you please provide the result without PE on ImageNet-1k?
2. What is the difference between Continuous 2D Scanning and the scan methods in ZigMa?
3. Could you please report the performance of the two methods against a baseline? For example:
   - Baseline [4-directional scan without continuous ..]
   - Baseline + Continuous 2D Scanning
   - Baseline + Direction-Aware Updating
   - Baseline + Continuous 2D Scanning + Direction-Aware Updating (which you already have)

FanqingM commented 6 months ago

I think this "novel" Continuous 2D Scanning is just the same as zigzag... However, the paper does not even cite zigzag.

ChenhongyiYang commented 6 months ago

Hi, thank you for your interest in our work. We are adding more ablation experiments to the upcoming version of our paper.

Continuous 2D Scanning is indeed very similar to the idea proposed in the concurrent ZigMa paper. We will also mention this in the upcoming version.
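
For readers unfamiliar with the term, here is a minimal sketch of what "continuous" scanning could mean, assuming it refers to an ordering in which consecutive tokens in the sequence are also neighbors on the 2D grid. This is not taken from the PlainMamba or ZigMa code; the function names `raster_order`, `snake_order`, and `max_step` are hypothetical and only illustrate one scan direction.

```python
def raster_order(h, w):
    """Row-major scan: every row is read left to right."""
    return [(r, c) for r in range(h) for c in range(w)]


def snake_order(h, w):
    """Zigzag scan (illustrative assumption): odd rows are read right to left,
    so the last token of one row sits directly above the first of the next."""
    order = []
    for r in range(h):
        cols = range(w) if r % 2 == 0 else range(w - 1, -1, -1)
        order.extend((r, c) for c in cols)
    return order


def max_step(order):
    """Largest Manhattan distance between consecutive tokens in the scan."""
    return max(
        abs(r1 - r0) + abs(c1 - c0)
        for (r0, c0), (r1, c1) in zip(order, order[1:])
    )


if __name__ == "__main__":
    h, w = 3, 4
    print("raster max step:", max_step(raster_order(h, w)))
    print("snake max step:", max_step(snake_order(h, w)))
```

In the raster scan the sequence jumps across the whole row at each row boundary, while in the snake-style scan every consecutive pair of tokens stays spatially adjacent.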

ChenhongyiYang commented 6 months ago

> I think this "novel" Continuous 2D Scanning is just same with zigzag..... However it even not cited zigzag

We note that ZigMa was submitted to arXiv on 03-20-2024, while our work was submitted on 03-25-2024. We respectfully disagree that we should be blamed for not citing a preprint that appeared only 6 days before ours. (Of course, we will cite it in the upcoming version :) )