Pointcept / PointTransformerV3

[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)
MIT License

Feature Pyramid implementation #67

Open aniket-gupta1 opened 4 months ago

aniket-gupta1 commented 4 months ago

Hi, thank you for your amazing work!

I am trying to use PTv3 to generate feature pyramids for a point cloud. For example, if I have a network with 3 encoder and 3 decoder layers, I want to do something like this: [image: Screenshot from 2024-06-17 19-30-57]

Can you please give some pointers on how I can do that?

Gofinge commented 4 months ago

Sorry for the late response. You can record a copy of the feature from each decoder stage during unpooling, so that the copy is not consumed by the next decoder stage, and then make sure that copy is also upsampled in the subsequent unpooling layers.

Modifying the unpooling is enough to achieve this.

aniket-gupta1 commented 3 months ago

But even after cloning, there is still the issue that the "pooling_parents" are deleted after every unpooling step. So even if I clone the output of Dec3 and use a separate unpooling layer, it will only increase the number of points by one step.

Basically, let N be the original number of points, and say Enc1 downsamples to N', Enc2 to N'', and Enc3 to N'''. The output of Dec3 has N'' points, so unpooling it once gives only N' points, which can't be added directly to the output of Dec1, which has N points.
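To make the problem above concrete, here is a minimal toy sketch, not the repository's code. It assumes a point set can be modelled as a plain dict, that pooling stores the finer level under a key like `pooling_parent` and the fine-to-coarse index map under `pooling_inverse`, and that unpooling pops both. The `toy_unpool` helper and the exact key names are hypothetical stand-ins. Under those assumptions, one workaround is to stash a copy of the coarse feature and of the inverse map before unpooling consumes them:

```python
import numpy as np

def toy_unpool(point):
    """Toy stand-in for one unpooling step (assumed behaviour, not PTv3 code).

    It pops the stored parent and inverse map, so both are gone afterwards.
    """
    parent = point.pop("pooling_parent")
    inverse = point.pop("pooling_inverse")
    # Broadcast each coarse feature back to the fine points that were pooled into it.
    parent["feat"] = parent["feat"] + point["feat"][inverse]
    return parent

# Fine level: 4 points. Coarse level: 2 points, each pooled from 2 fine points.
fine = {"feat": np.zeros((4, 2))}
coarse = {
    "feat": np.array([[1.0, 1.0], [2.0, 2.0]]),
    "pooling_parent": fine,
    "pooling_inverse": np.array([0, 0, 1, 1]),
}

# Stash what the pyramid needs BEFORE unpooling removes it.
saved_inverse = coarse["pooling_inverse"].copy()
pyramid_feat = coarse["feat"].copy()

out = toy_unpool(coarse)

# The saved map still lets us lift the coarse copy to the fine resolution.
pyramid_at_fine = pyramid_feat[saved_inverse]
```

With several levels, one saved inverse map per pooling step would be needed to climb all the way back to N points.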

Gofinge commented 2 months ago

Hi, you can use "pooling_inverse" to map these embeddings to the original shape.
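One way to read this suggestion is sketched below with NumPy arrays standing in for tensors. It assumes each pooling step yields an integer index array with one entry per finer point, giving the index of the coarser point it was pooled into, and that these arrays were saved in fine-to-coarse order. The helper name `compose_inverses` is hypothetical. Chaining the arrays gives a direct map from the original N points to any coarser level, so a coarse feature can be gathered at full resolution in a single indexing step:

```python
import numpy as np

def compose_inverses(inverses):
    """Chain per-level inverse maps (ordered fine to coarse) into one map.

    inverses[k][i] is the index, at level k+1, of the point that level-k
    point i was pooled into. The result maps level-0 points to the last level.
    """
    composed = inverses[0]
    for inv in inverses[1:]:
        composed = inv[composed]
    return composed

# Toy hierarchy: N = 6 points -> N' = 3 points -> N'' = 2 points.
inv_1 = np.array([0, 0, 1, 1, 2, 2])  # N  -> N'
inv_2 = np.array([0, 0, 1])           # N' -> N''

# A feature living on the coarsest level (N'' = 2 points, 2 channels).
feat_coarse = np.array([[1.0, 10.0], [2.0, 20.0]])

to_coarse = compose_inverses([inv_1, inv_2])  # shape (6,)
feat_full = feat_coarse[to_coarse]            # shape (6, 2)
```

The same integer indexing works on PyTorch tensors. Once every pyramid level has been gathered to N points this way, the levels can be added or concatenated with the full-resolution decoder output.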