xhanxu / Mamba3D

[ACM MM 2024] Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model
https://xhanxu.github.io/

Questions about LNP #4

Closed Chisato-Sophia closed 5 months ago

Chisato-Sophia commented 5 months ago

About $\mathrm{F}_K \in \mathbb{R}^{L \times k \times C}$: I'm not sure how it is obtained. According to Eq. (5), $\mathbf{z}^{\prime}_\ell = \mathbf{LNP}(\mathrm{LN}(\mathbf{z}_{\ell-1} + \mathbf{E}_{pos})) + \mathbf{z}_{\ell-1}$, the LNP input is $\mathbf{z}_{\ell-1} + \mathbf{E}_{pos} \in \mathbb{R}^{(L+1) \times C}$, but I don't know how to generate $\mathrm{F}_K$ from this input. The $k$ in LNP is completely different from the $K$ in the Patch Embedding. Did you use a second KNN here? I don't know how to get $L \times k \times C$ from $(L+1) \times C$. Looking forward to your reply! Thank you!

xhanxu commented 5 months ago

I apologize for not detailing this part in the paper. Yes, the $K$ in the Patch Embedding and the $k$ in the LNP represent two different KNN operations. The $K$-NN in the Patch Embedding is performed on the original point cloud, while the $k$-NN in the LNP is performed on $L$ central points.

In the LNP, we run $k$-NN on the $L$ central points and gather the corresponding features, which gives $\mathrm{F}_K$ of size $L \times k \times C$.
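To make the shapes concrete, here is a minimal NumPy sketch of that step. It is only an illustration, not the repository's code: the helper name `knn_indices`, the brute-force distance computation, and the toy sizes are assumptions, and the sketch counts each point as its own nearest neighbor, which the thread does not specify.

```python
import numpy as np

def knn_indices(centers, k):
    """Return the indices of the k nearest centers for each center, shape (L, k)."""
    # Pairwise squared Euclidean distances between centers, shape (L, L).
    diff = centers[:, None, :] - centers[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    # Sort each row by distance and keep the k closest indices.
    return np.argsort(dist2, axis=1)[:, :k]

rng = np.random.default_rng(0)
L, k, C = 8, 4, 16                    # toy sizes (assumed for illustration)
centers = rng.normal(size=(L, 3))     # xyz of the L central points
tokens = rng.normal(size=(L, C))      # one feature token per central point, no CLS

idx = knn_indices(centers, k)         # (L, k) neighbor indices from xyz only
F_K = tokens[idx]                     # (L, k, C) gathered neighbor features
print(F_K.shape)
```

The neighbor search uses only the xyz coordinates, while the gather uses the resulting indices on the feature tokens, which is how an $L \times C$ tensor becomes $L \times k \times C$.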

In fact, for computational simplicity, and because the CLS token does not explicitly contain local geometric information about the point cloud, we first remove it in the LNP block and concatenate it back after the LNP processing. However, you can still obtain $L \times k \times C$ from $(L+1) \times C$. Since the KNN is computed on the xyz coordinates, you get the indices of the $k$ neighbors for each of the $L$ central points. Then, for each central point, you use those $k$ indices to retrieve $k$ tokens from the $L+1$ tokens, paying attention to the index shift caused by the CLS token.
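The two ways of handling the CLS token can be compared in a short NumPy sketch. This is a hedged illustration rather than the actual implementation: it assumes the CLS token sits at row 0 of the token sequence (so the shift is `+1`), uses a brute-force k-NN, and skips the LNP processing itself.

```python
import numpy as np

rng = np.random.default_rng(0)
L, k, C = 8, 4, 16                          # toy sizes (assumed for illustration)
centers = rng.normal(size=(L, 3))           # xyz of the L central points
z = rng.normal(size=(L + 1, C))             # L+1 tokens; CLS assumed at row 0

# k-NN on xyz only, so the indices run over 0..L-1 (the CLS token has no xyz).
dist2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
idx = np.argsort(dist2, axis=1)[:, :k]      # (L, k)

# Option A: remove CLS, gather from the L patch tokens, concatenate CLS back.
cls_tok, patch_tok = z[:1], z[1:]
F_K_a = patch_tok[idx]                      # (L, k, C)
# Placeholder for the LNP output: here the patch tokens are passed through unchanged.
z_out = np.concatenate([cls_tok, patch_tok], axis=0)   # back to (L+1, C)

# Option B: gather directly from the L+1 tokens, shifting indices past the CLS row.
F_K_b = z[idx + 1]                          # (L, k, C)

print(np.array_equal(F_K_a, F_K_b))
```

If the CLS token were stored at the end of the sequence instead, no shift would be needed for Option B, so the offset depends on the token layout.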

Thank you for pointing this out. Does this answer your question?

Chisato-Sophia commented 5 months ago

Thanks for your reply! My problems were solved perfectly.