Closed Chisato-Sophia closed 5 months ago
I apologize for not detailing this part in the paper. Yes, the $K$ in the Patch Embedding and the $k$ in the LNP represent two different KNN operations. The $K$-NN in the Patch Embedding is performed on the original point cloud, while the $k$-NN in the LNP is performed on $L$ central points.
In the LNP, we use $k$-NN on the $L$ central points and then obtain the corresponding features, thus obtaining the $F_K$ of size $( L \times k \times C )$.
In fact, for computational simplicity, and because the CLS token does not explicitly contain local geometric information about the point cloud, we first remove it in the LNP block and then concatenate it back after the LNP block has processed the remaining tokens. However, you can still obtain $L \times k \times C$ from $(L+1) \times C$. Because KNN is computed on the xyz coordinates, you can get the indices of the $k$ neighbors of each of the $L$ central points. Then, for each central point, you use those $k$ indices to retrieve $k$ tokens from the $L+1$ tokens, paying attention to the index shift caused by the CLS token.
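To make the indexing concrete, here is a minimal NumPy sketch of the gather step described above. It is an illustration under stated assumptions, not the code from the repository: the function names `knn_indices` and `gather_neighbor_tokens` are hypothetical, the CLS token is assumed to sit at index 0 of the token sequence, and each central point is assumed to count as its own nearest neighbor.

```python
import numpy as np

def knn_indices(centers, k):
    """Return the indices of the k nearest central points for each central point.

    centers: (L, 3) xyz coordinates of the L central points.
    Returns an (L, k) integer array. Each point is its own nearest
    neighbor here (distance 0), which is an assumption of this sketch.
    """
    diff = centers[:, None, :] - centers[None, :, :]   # (L, L, 3)
    dist2 = np.sum(diff ** 2, axis=-1)                 # (L, L) squared distances
    return np.argsort(dist2, axis=1, kind="stable")[:, :k]

def gather_neighbor_tokens(tokens, centers, k):
    """Build F_K of shape (L, k, C) from tokens of shape (L + 1, C).

    Assumes tokens[0] is the CLS token and tokens[1:] are the L patch
    tokens, in the same order as `centers`.
    """
    idx = knn_indices(centers, k)   # indices into the L central points
    return tokens[idx + 1]          # +1 skips the CLS token at index 0

# Toy example with random data (values are placeholders, not real features).
rng = np.random.default_rng(0)
L, k, C = 8, 3, 4
centers = rng.normal(size=(L, 3))
tokens = rng.normal(size=(L + 1, C))

F_K = gather_neighbor_tokens(tokens, centers, k)
print(F_K.shape)  # (8, 3, 4)
```

Dropping the CLS token first, as we do in the LNP block, is equivalent to indexing `tokens[1:][idx]`, which avoids the `+ 1` shift altogether.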
Thank you for pointing this out. Does this answer your questions?
Thanks for your reply! My problems were solved perfectly.
About $\mathrm{F}_K \in \mathbb{R}^{L \times k \times C}$, I'm not sure how it's obtained. According to Eq. (5), $\mathbf{z}'_\ell = \mathrm{LNP}(\mathrm{LN}(\mathbf{z}_{\ell-1} + \mathbf{E}_{pos})) + \mathbf{z}_{\ell-1}$, the LNP input is $\mathbf{z}_{\ell-1} + \mathbf{E}_{pos} \in \mathbb{R}^{(L+1) \times C}$, but I don't know how to generate $\mathrm{F}_K$ from this input. The "k" in LNP is totally different from the "K" in Patch Embeddings. Did you use a second KNN here? I don't know how to generate $L \times k \times C$ from $(L+1) \times C$. Looking forward to your reply! Thank you!