hustvl / Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Apache License 2.0
2.55k stars, 160 forks

Memory/speed improvements over DeiT for larger Vim #17

Open karolpustelnik opened 5 months ago

karolpustelnik commented 5 months ago

I found your Vision Mamba paper very interesting. However, when running your code I ran into something that may or may not be expected behavior. When I measured GPU memory consumption and FPS for Vim variants other than Tiny, I could not reproduce comparable speed and memory improvements. I compared against DeiT, and the improvements were visible only for Vim-Ti. Am I doing something wrong, or do the improvements apply only to the Tiny version?
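For reference, here is a minimal sketch of the kind of timing harness such a Vim vs. DeiT comparison might use. It is not the repository's benchmark script. The names `measure_fps`, `step`, and `sync` are hypothetical, and the harness is deliberately framework-agnostic: `step` is assumed to run one forward pass on a fixed batch, and `sync` is assumed to block until queued device work has finished.

```python
import time
from typing import Callable, Optional


def measure_fps(
    step: Callable[[], object],
    batch_size: int,
    n_warmup: int = 10,
    n_iters: int = 50,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return throughput in images per second for repeated calls to `step`.

    `step` is a hypothetical callable that runs one forward pass on a batch
    of `batch_size` images. `sync` should wait for outstanding device work;
    without it, asynchronous GPU launches make the timing meaningless.
    """
    if batch_size <= 0 or n_iters <= 0:
        raise ValueError("batch_size and n_iters must be positive")
    if sync is None:
        sync = lambda: None  # CPU-only case: nothing to wait for

    # Warm-up iterations are excluded from timing (kernel selection,
    # allocator growth, and caching all happen here).
    for _ in range(n_warmup):
        step()
    sync()

    start = time.perf_counter()
    for _ in range(n_iters):
        step()
    sync()  # make sure all timed work has actually finished
    elapsed = time.perf_counter() - start

    return batch_size * n_iters / elapsed
```

With PyTorch on a GPU, one would typically pass `sync=torch.cuda.synchronize`, run `step` under `torch.no_grad()`, and read peak memory by calling `torch.cuda.reset_peak_memory_stats()` before the run and `torch.cuda.max_memory_allocated()` after it. Using the same batch size, input resolution, and precision for both models keeps the comparison fair.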