Open lsaeuro opened 2 years ago
The output of the official code is 32 dimensions.
Yes indeed, I am talking about the decoder output, not the network output.
Anyway, is it N=8 in the paper or not? I am confused about why they changed it.
I know what you mean: the decoder output of the official code is 32 dimensions. You can debug it and inspect the result yourself.
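One way to do that debugging, sketched here with forward hooks on a toy stand-in model (the real network's layers and names differ, this only shows the technique):

```python
import torch
import torch.nn as nn

# Toy stand-in for the decoder; the actual model has BN + LeakyReLU too.
model = nn.Sequential(
    nn.Conv2d(768, 256, kernel_size=1, bias=False),
    nn.Conv2d(256, 32, kernel_size=1, bias=False),
)

def report_shape(name):
    # Forward hook that prints the output shape of a layer.
    def hook(module, inputs, output):
        print(f"{name}: {tuple(output.shape)}")
    return hook

for name, layer in model.named_children():
    layer.register_forward_hook(report_shape(name))

# (batch, channels, points, 1), like the tensors discussed below.
x = torch.randn(6, 768, 1000, 1)
_ = model(x)  # prints each layer's output shape as it runs
```

Running the real model's forward pass with hooks registered on `decoder_blocks` would print the per-layer shapes without modifying the source.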
Thank you, I see. At this point I think I will try both implementations and see which one performs better.
Do you know the reason for this processing? I haven't found any such shuffling operations in the official code.
Thank you very much!
I don't think I have the answer to this... Do you know where this probability sampling strategy is mentioned in the paper? @773041642
The random sampling is not specifically explained in the paper. You can look at how it is implemented in the code, or at other people's explanations of the paper.
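For reference, a minimal NumPy sketch of the "possibility"-based sampling used in the official data pipeline (names paraphrased, not the exact code): each point carries a possibility score, every step picks the lowest-scoring point as the crop centre, then raises the scores inside the crop so later crops drift toward unvisited regions.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((1000, 3))         # toy point cloud
possibility = rng.random(1000) * 1e-3  # small random init, as in the official code

def sample_crop(k=100):
    # Least-visited point becomes the crop centre.
    centre = int(np.argmin(possibility))
    # Take its k nearest neighbours as the crop.
    d2 = np.sum((points - points[centre]) ** 2, axis=1)
    idx = np.argsort(d2)[:k]
    # Raise possibilities, weighted so points near the centre grow fastest.
    possibility[idx] += 1.0 - d2[idx] / d2[idx].max()
    return idx

crop = sample_crop()  # subsequent calls pick different regions
```

This is why the training crops look "random" even though no explicit shuffle appears: the randomness comes from the possibility initialisation and updates.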
Thank you for your answers. I was also looking at the code to figure out whether they use the variable 'cloud_ind': in this PyTorch version it is never used, while in the TF version they use it here.
Do you know why?
It's only used for testing on the full point cloud. This repo doesn't run that test, so it's unused.
Hi, thank you for your work, it is very useful for me. By the way, looking at the model structure I noticed a difference from the original network in the last decoder dimension. Your structure is:
```
(decoder_blocks): ModuleList(
  (0): Conv2d(
    (conv): Conv2d(768, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn): BatchNorm2d(
      (bn): BatchNorm2d(256, eps=1e-06, momentum=0.99, affine=True, track_running_stats=True)
    )
    (activation): LeakyReLU(negative_slope=0.2, inplace=True)
  )
  (1): Conv2d(
    (conv): Conv2d(384, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn): BatchNorm2d(
      (bn): BatchNorm2d(128, eps=1e-06, momentum=0.99, affine=True, track_running_stats=True)
    )
    (activation): LeakyReLU(negative_slope=0.2, inplace=True)
  )
  (2): Conv2d(
    (conv): Conv2d(160, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn): BatchNorm2d(
      (bn): BatchNorm2d(32, eps=1e-06, momentum=0.99, affine=True, track_running_stats=True)
    )
    (activation): LeakyReLU(negative_slope=0.2, inplace=True)
  )
  (3): Conv2d(
    (conv): Conv2d(64, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn): BatchNorm2d(
      (bn): BatchNorm2d(32, eps=1e-06, momentum=0.99, affine=True, track_running_stats=True)
    )
    (activation): LeakyReLU(negative_slope=0.2, inplace=True)
  )
)
```
We can see that the last layer of the decoder outputs 32 features, while according to the network in the paper it should be downsampled by 4, so it should be 8.
Moreover, we can see it here as well: for an input point cloud of 1000 points (a sample pc), with batch_size = 6, the last decoder layer produces torch.Size([6, 32, 1000, 1]).
My suggestion is to set d_out = 8 here, or something less hard-coded, but with 8 as the final dimension.
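A hedged sketch of what that could look like (the variable names `d_in`/`d_out` are illustrative, not the repo's actual config): drive the decoder widths from one list, with the input widths taken from the ModuleList dump above and only the last output width changed from 32 to 8.

```python
import torch
import torch.nn as nn

# Channel lists; d_in values match the decoder dump, and only the
# last d_out entry changes (repo currently ends with 32, paper says 8).
d_in  = [768, 384, 160, 64]
d_out = [256, 128, 32, 8]

# Simplified blocks (the real ones also carry BN + LeakyReLU).
decoder_blocks = nn.ModuleList(
    nn.Conv2d(c_in, c_out, kernel_size=1, bias=False)
    for c_in, c_out in zip(d_in, d_out)
)

# A 1000-point toy cloud through the last stage now yields 8 channels.
x = torch.randn(6, 64, 1000, 1)
print(decoder_blocks[-1](x).shape)  # torch.Size([6, 8, 1000, 1])
```

With this layout, matching the paper is a one-entry change in `d_out` instead of editing a hard-coded layer.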
If I am wrong, could you please explain the meaning of and the reason for that output? Thank you in advance.