TiagoCortinhal / SalsaNext

Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving
MIT License

Downsampling rate #48

Closed benemer closed 3 years ago

benemer commented 3 years ago

Hi!

From your ResBlock class, I can see that you use a constant downsampling rate of 2 by using the nn.AvgPool2d layer with kernel_size=3, stride=2 and padding=1.
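As a quick sanity check on that rate, the output size of a pooling layer along one axis can be computed from the standard formula floor((n + 2*padding - kernel_size) / stride) + 1. This is a minimal sketch: `pooled_size` is a hypothetical helper written for illustration, not a function from the SalsaNext repository, and it assumes the pooling layer uses the default `ceil_mode=False`.

```python
def pooled_size(n, kernel_size=3, stride=2, padding=1):
    """Output length along one axis for a pooling layer.

    Hypothetical helper (not part of SalsaNext); assumes ceil_mode=False,
    so the division is floored.
    """
    return (n + 2 * padding - kernel_size) // stride + 1


# Input range image size taken from the discussion: width 2048, height 64.
width, height = 2048, 64
print(pooled_size(width), pooled_size(height))  # prints: 1024 32
```

With these settings a single pooling layer halves each spatial dimension, so one block cannot take the width from 2048 to 512.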

However, in your arXiv paper, the first residual block downsamples the width from 2048 to 512, which indicates a downsampling rate of 4. Also, I don't understand how the last layer upsamples the feature map from 1024x64x32 to 2048x64x32, since in your code a Conv2d layer with kernel_size=(1,1) is used there.

Is this a mistake in the visualization of the architecture?

Thank you!

TiagoCortinhal commented 3 years ago

Hello!!

Yes... Strangely, I hadn't picked up on that visualization error. We tried having an extra layer on both sides, and I must have forgotten to update the figure correctly.

It should go like: (2048, 64) -> (1024, 32) -> (512, 16) -> (256, 8) -> (128, 4)

The final conv-1x1 shouldn't change the dimensionality either, as you pointed out.
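To illustrate why a 1x1 convolution leaves the spatial size untouched, here is a small sketch using the usual convolution output-size formula. `conv_out_size` is a hypothetical helper, and the stride of 1, padding of 0 and dilation of 1 are assumptions (the thread only states kernel_size=(1,1)).

```python
def conv_out_size(n, kernel_size=1, stride=1, padding=0, dilation=1):
    """Output length along one axis for a convolution.

    Hypothetical helper; stride=1, padding=0 and dilation=1 are assumed,
    since the thread only specifies kernel_size=(1, 1).
    """
    return (n + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# With a 1x1 kernel the width and height pass through unchanged.
print(conv_out_size(2048), conv_out_size(64))  # prints: 2048 64
```

Under those assumptions only the channel count can change in that layer, not the width or height.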

Thanks for pointing that out!

benemer commented 3 years ago

Thanks a lot for the fast reply and clarification!

Since you use 4 pooling layers, I assume you mean:

(2048, 64) -> (1024, 32) -> (512, 16) -> (256, 8) -> (128, 4)
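This sequence can be reproduced by applying the pooling output-size formula four times. The sketch below is only an illustration: `pooled_size` and `encoder_shapes` are hypothetical helpers, not code from the repository, and `ceil_mode=False` is assumed.

```python
def pooled_size(n, kernel_size=3, stride=2, padding=1):
    """Output length along one axis for a pooling layer (ceil_mode=False assumed)."""
    return (n + 2 * padding - kernel_size) // stride + 1


def encoder_shapes(width, height, num_pools=4):
    """List the (width, height) pairs before and after each pooling layer."""
    shapes = [(width, height)]
    for _ in range(num_pools):
        width, height = pooled_size(width), pooled_size(height)
        shapes.append((width, height))
    return shapes


# Four pooling layers on a 2048 x 64 input, as discussed above.
print(encoder_shapes(2048, 64))
# prints: [(2048, 64), (1024, 32), (512, 16), (256, 8), (128, 4)]
```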

Thanks again!

TiagoCortinhal commented 3 years ago

And I did it again ahah!!

Exactly that!