Closed benemer closed 3 years ago
Hello!!
Yes... Strangely, I hadn't picked up on that visualization error. We tried adding an extra layer on both sides, and I must have forgotten to update the figure correctly.
It should go like (2048, 64) -> (1024, 32) -> (512, 16) -> (256, 8) -> (128, 4)
The final 1x1 conv shouldn't change the dimensionality either, as you pointed out.
Thanks for pointing it out!
Thanks a lot for the fast reply and clarification!
Since you use 4 pooling layers, I assume you mean:
(2048, 64) -> (1024, 32) -> (512, 16) -> (256, 8) -> (128, 4)
Thanks again!
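For anyone who wants to check the progression above, here is a minimal sketch in plain Python. It assumes the standard output-size formula for pooling with `ceil_mode=False`, the `kernel_size=3`, `stride=2`, `padding=1` settings mentioned in this thread, and that pooling is applied to both spatial dimensions. The helper names `pooled_size` and `shape_progression` are hypothetical and not taken from the repository.

```python
def pooled_size(n, kernel_size=3, stride=2, padding=1):
    """Output length of one pooling step along a single dimension.

    Uses floor((n + 2*padding - kernel_size) / stride) + 1,
    i.e. the usual formula when ceil_mode is False.
    """
    return (n + 2 * padding - kernel_size) // stride + 1


def shape_progression(width, height, num_pools=4):
    """Return the (width, height) pairs before and after each pooling step."""
    shapes = [(width, height)]
    for _ in range(num_pools):
        # Assumption: the same pooling is applied to both spatial dimensions.
        width, height = pooled_size(width), pooled_size(height)
        shapes.append((width, height))
    return shapes


for w, h in shape_progression(2048, 64):
    print(w, h)
```

With these settings each pooling step halves both dimensions, so four steps take a 2048 x 64 input down to 128 x 4.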
And I did it again ahah!!
Exactly that!
Hi!
From your `ResBlock` class, I can see that you use a constant downsampling rate of 2 by using the `nn.AvgPool2d` layer with `kernel_size=3`, `stride=2` and `padding=1`.
However, in your arXiv paper, the first residual block downsamples the width from 2048 to 512, which indicates a downsampling rate of 4. Also, I don't understand how the last layer upsamples the feature map from 1024x64x32 to 2048x64x32, since in your code a `Conv2d` layer with `kernel_size=(1,1)` is used here.
Is this a mistake in the visualization of the architecture?
Thank you!
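The point about the last layer can be checked with the standard convolution output-size formula. This is only a sketch in plain Python, assuming the 1x1 convolution uses stride 1, no padding and dilation 1 (the thread does not state these). The helper name `conv_out_size` is hypothetical.

```python
def conv_out_size(n, kernel_size=1, stride=1, padding=0, dilation=1):
    """Output length of a convolution along a single spatial dimension.

    Uses floor((n + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1.
    """
    return (n + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Assumed defaults: stride=1, padding=0, dilation=1 for the 1x1 convolution.
for size in (1024, 64):
    print(size, "->", conv_out_size(size))
```

Under these assumptions a 1x1 convolution leaves the spatial size unchanged and can only change the number of channels, so it cannot turn a width of 1024 into 2048.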