xhanxu / Mamba3D

[ACM MM 2024] Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model
https://xhanxu.github.io/

Questions about biSSM #1

Closed Chuxwa closed 5 months ago

Chuxwa commented 5 months ago

Hi, I read your paper and like your work. One thing I'm not sure about: are L+SSM and C-SSM computed along the token dimension (the L axis)?

xhanxu commented 5 months ago

Glad to hear that you like our work. They are not computed along the L axis, but rather along the C axis. As illustrated in Figure 3, L+SSM is equivalent to C+SSM, and together with C-SSM it is computed along the C axis. Then, following Mamba, both are scanned in the order of L+. We use the name L+ to distinguish it from token flipping (L-), which compels the model to learn information about the forward and reverse order of tokens. We believe that such information carries a pseudo-order dependency. Does this answer your question? Sorry if the paper caused any misunderstanding.

Chuxwa commented 5 months ago

Thanks for your reply. In BiSSM, both L+SSM and C-SSM are scanned in the order of L+. This solved my problem.
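
For readers skimming this thread, the distinction can be sketched in a few lines of plain Python. This is only a toy illustration, not the Mamba3D implementation: the recurrence `scan_tokens`, the per-channel `decays`, the branch function names, and the choice to flip the output back for comparison are all assumptions made up for this example. It shows that a channel-flipped branch (C-) still visits tokens in forward order (L+), whereas a token-flipped branch (L-) reverses the token order.

```python
# Toy sketch (hypothetical names, not the Mamba3D code).
# x is a list of L tokens, each a list of C feature values.

def scan_tokens(x, decays):
    """Causal toy recurrence over tokens: h[t][c] = decays[c] * h[t-1][c] + x[t][c]."""
    state = [0.0] * len(decays)
    out = []
    for row in x:  # tokens are always visited in forward order (L+)
        state = [a * h + v for a, h, v in zip(decays, state, row)]
        out.append(list(state))
    return out

def flip_channels(x):
    """Reverse the feature (C) axis of every token."""
    return [row[::-1] for row in x]

def flip_tokens(x):
    """Reverse the token (L) axis."""
    return x[::-1]

def l_plus_branch(x, decays):
    # Forward branch: original channels, forward token order.
    return scan_tokens(x, decays)

def c_minus_branch(x, decays):
    # Channel-flipped branch: still scanned in forward token order.
    # Flipping the output back is an assumption, done here only for comparison.
    return flip_channels(scan_tokens(flip_channels(x), decays))

def l_minus_branch(x, decays):
    # Token-flipped branch, shown only for contrast with C-.
    return flip_tokens(scan_tokens(flip_tokens(x), decays))

x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # L = 3 tokens, C = 2 channels
decays = [0.5, 0.25]                       # hypothetical per-channel parameters

print(l_plus_branch(x, decays))
print(c_minus_branch(x, decays))
print(l_minus_branch(x, decays))
```

In this toy, the first output token of both `l_plus_branch` and `c_minus_branch` depends only on the first input token, so neither branch imposes a reverse token order. The first output token of `l_minus_branch` depends on every later token, which is the order dependency that token flipping introduces.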