slmatrix closed this issue 3 years ago
(..) provided that the parameters of the fully connected network (..)
This should say "layer" rather than "network".
conv6 will use 1024 filters, each with dimensions 1, 1, 1024
Should this say conv7, and should the dimensions be 2, 2, 1024, because the previous layer is decimated to 3, 3, 512?
@slmatrix FM10 is 3x3 (x256), so why is FM11 1x1 (x256)? I think it should be 2x2 (x256), because 2x2 max pooling is used together with the mathematical ceiling function.
@AnhPC03, there is no max pooling. The auxiliary layers downsample their spatial dimensions through convolutions with a padding of 0 (that is, no padding). Downsampling is a side effect of convolution, which is why conv layers are typically padded.
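To make the arithmetic behind this reply concrete, here is a minimal sketch of the standard convolution output-size formula. The 3x3 kernel and stride of 1 are assumptions for illustration (the thread does not state them), and `conv_output_size` is a hypothetical helper, not a function from the tutorial.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed for illustration: 3x3 kernel, stride 1 (not stated in the thread).
# With a padding of 0, a 3x3 feature map shrinks to 1x1.
print(conv_output_size(3, kernel=3, stride=1, padding=0))  # 1

# With a padding of 1, the same convolution preserves the 3x3 size.
print(conv_output_size(3, kernel=3, stride=1, padding=1))  # 3
```

Under these assumptions, a 3x3 map goes to 1x1 without any pooling, which is consistent with the explanation above.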
Closing this issue.
But pixel values are next to useless if we don't know the actual dimensions of the image.
Pixel values and their representation as fractions of the image's dimensions are equivalent. That is, they provide the same amount of information.
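For reference, the two representations can be sketched as follows. This assumes boxes are given as (x_min, y_min, x_max, y_max) tuples; `to_fractional` and `to_pixels` are hypothetical helper names, and the image size and box are made-up example values. Note that converting in either direction uses the image's width and height.

```python
def to_fractional(box, width, height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to fractions of the image size."""
    x_min, y_min, x_max, y_max = box
    return (x_min / width, y_min / height, x_max / width, y_max / height)


def to_pixels(box, width, height):
    """Convert a fractional box back to pixel coordinates."""
    x_min, y_min, x_max, y_max = box
    return (x_min * width, y_min * height, x_max * width, y_max * height)


# Made-up example: a 400x200 image with a box in pixel coordinates.
width, height = 400, 200
pixel_box = (100, 50, 300, 150)

fractional_box = to_fractional(pixel_box, width, height)
print(fractional_box)  # (0.25, 0.25, 0.75, 0.75)

# The round trip recovers the original pixel values.
print(to_pixels(fractional_box, width, height))  # (100.0, 50.0, 300.0, 150.0)
```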