kyegomez / VisionMamba

Implementation of Vision Mamba from the paper "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". It is 2.8x faster than DeiT and saves 86.8% GPU memory when performing batch inference to extract features on high-resolution images.

forward-backward ssm #3

Closed: swarajnanda2021 closed this 6 months ago

swarajnanda2021 commented 9 months ago

It appears, per Algorithm 1 of the Vision Mamba paper, that the state-space model runs bidirectionally along the token sequence. But in this implementation I see that both the forward and backward convolutions are standard 1D convolutions, rather than one scanning forward and the other scanning backward over the sequence. Can you explain your rationale behind this?

Otherwise there seems to be no difference between the backward and forward operations, which are what should give the Mamba block a bidirectional way of applying the state-space selection operation.
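For reference, a minimal sketch of what I mean by the bidirectional branch in Algorithm 1 (here `conv1d` and `ssm` are placeholder callables, not this repository's actual modules, and the combination step is simplified):

```python
import torch

def bidirectional_branch(x, conv1d, ssm):
    # x: (batch, seq_len, dim) token sequence.
    # Forward direction: scan the token sequence left to right.
    y_fwd = ssm(conv1d(x))

    # Backward direction: reverse the token order, run the same conv + SSM,
    # then restore the original order so the two outputs align per token.
    x_rev = torch.flip(x, dims=[1])
    y_bwd = torch.flip(ssm(conv1d(x_rev)), dims=[1])

    # Combine the two directional outputs (the paper gates each branch
    # before summing; a plain sum is used here for illustration).
    return y_fwd + y_bwd
```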


github-actions[bot] commented 9 months ago

Hello there, thank you for opening an Issue! 🙏🏻 The team was notified and they will get back to you asap.

kyegomez commented 9 months ago

@swarajnanda2021 yes, I did it this way for now, until I understand what scanning forward and backward in space means.

swarajnanda2021 commented 9 months ago

Thanks for the fast response. Per the stencil operations most practitioners use, it seems to me this is just forward and backward for-loops over the scan process. Do you concur?
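Something like this toy scan is what I have in mind; `directional_scan` is a hypothetical helper with a plain linear recurrence, omitting Mamba's input-dependent (selective) parameters and discretization:

```python
import torch

def directional_scan(x, A, B, reverse=False):
    # x: (seq_len, dim); A, B: (dim,) per-channel recurrence coefficients.
    seq_len, dim = x.shape
    h = torch.zeros(dim)
    out = torch.empty_like(x)
    # The only difference between the two directions is the iteration order.
    steps = reversed(range(seq_len)) if reverse else range(seq_len)
    for t in steps:
        h = A * h + B * x[t]  # simple linear SSM recurrence
        out[t] = h
    return out
```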

github-actions[bot] commented 7 months ago

Stale issue message

thucz commented 3 months ago

So the forward and backward SSMs are, in fact, no different?