WangYueFt / dgcnn

MIT License
1.62k stars 421 forks

Point cloud transform block different from paper #16

Closed patrick-llgc closed 4 years ago

patrick-llgc commented 5 years ago

Hi!

Thanks for sharing the code. While reading the paper, I found that the implementation of the point cloud transform block (a.k.a. T-Net in the original PointNet paper) in the code differs from what is described in the paper.

In the paper, as shown in Fig. 3, the coordinate differences between each point and its k nearest neighbors are concatenated with the coordinates of the point (therefore n x k x (3+3) = n x k x 6).
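To make the n x k x 6 shape concrete, here is a minimal NumPy sketch of how such an edge-feature tensor could be built. It is not the repository's code: the function name `knn_edge_features`, the brute-force neighbor search, the exclusion of each point from its own neighbor list, and the ordering of the two halves (point coordinates first, differences second) are all assumptions made for illustration.

```python
import numpy as np

def knn_edge_features(points, k):
    """Build (n, k, 6) edge features from an (n, 3) point cloud.

    Hypothetical helper for illustration; not taken from the dgcnn repo.
    """
    # Pairwise squared distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # Assumption: a point is not counted as its own neighbor.
    np.fill_diagonal(dist2, np.inf)
    # Indices of the k nearest neighbors of each point, shape (n, k).
    idx = np.argsort(dist2, axis=1)[:, :k]
    neighbors = points[idx]                               # (n, k, 3)
    central = np.repeat(points[:, None, :], k, axis=1)    # (n, k, 3)
    # Concatenate point coordinates with neighbor offsets: 3 + 3 = 6 channels.
    return np.concatenate([central, neighbors - central], axis=-1)

points = np.random.default_rng(0).normal(size=(10, 3))
edge_feats = knn_edge_features(points, k=4)
print(edge_feats.shape)
```

The first three channels repeat the point's own coordinates for each of its k neighbors, and the last three hold the neighbor offsets.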

However, in the code there is one additional max pooling along the number-of-points axis compared to the original PointNet implementation. I do not quite understand why.

If someone can enlighten me or share their thoughts on this, I would be very grateful.

Edit: I think I figured out why the input of the T-Net is n x k x 6: the input has already gone through the feature transformation defined here. However, I am still puzzled by the additional max pooling operation compared to the original PointNet implementation.

WangYueFt commented 4 years ago

We initially wanted to use EdgeConv in the spatial transformer as we did for the main backbone.
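For readers following the shapes, the sketch below illustrates one way an EdgeConv-style transform block can reduce an n x k x 6 input to a single 3 x 3 matrix. It is a shape-only illustration under stated assumptions, not the repository's implementation: the function names, the layer widths (16 and 32), and the random weights are hypothetical. The point it shows is that an n x k x 6 input has two axes to collapse (neighbors, then points), whereas an n x 3 input as in PointNet has only one.

```python
import numpy as np

def shared_mlp(x, w):
    """Apply the same linear layer followed by ReLU along the last axis."""
    return np.maximum(x @ w, 0.0)

def transform_net_sketch(edge_feats, w1, w2, w3):
    """Map (n, k, 6) edge features to a 3x3 matrix.

    Hypothetical sketch; weights and widths are placeholders.
    """
    per_edge = shared_mlp(edge_feats, w1)    # (n, k, c1)
    per_point = per_edge.max(axis=1)         # (n, c1): max pool over k neighbors
    per_point = shared_mlp(per_point, w2)    # (n, c2)
    global_feat = per_point.max(axis=0)      # (c2,): max pool over n points
    # Add the identity so that an all-zero regression output gives the
    # identity transform.
    return (global_feat @ w3).reshape(3, 3) + np.eye(3)

rng = np.random.default_rng(0)
w1 = rng.normal(size=(6, 16))
w2 = rng.normal(size=(16, 32))
w3 = rng.normal(size=(32, 9))
edge_feats = rng.normal(size=(10, 4, 6))     # n = 10 points, k = 4 neighbors
print(transform_net_sketch(edge_feats, w1, w2, w3).shape)
```

Because both reductions are max operations, the resulting matrix does not depend on the order of the points or of the neighbors within each point's list.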