RaySunWHUT opened 3 years ago
This is a basic convolutional layer. A convolutional layer with stride 2 downsamples the image from 64x64 to 32x32. If you use 64 filters of size 3x3, you get an output with 64 channels. That is how an input of 64x64x3 becomes 32x32x64.
But when I visualize the Generator network architecture, the stride is (1, 1), and the first layer has kernel_size = (7, 7). In that case, shouldn't the output size after passing the input image through the first layer be 64 - 7 + 1 = 58?
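Both numbers follow from the standard conv output-size formula, floor((n + 2p - k) / s) + 1. A minimal pure-Python sketch (the helper name `conv_out_size` is illustrative, not from the repo):

```python
def conv_out_size(n, k, s=1, p=0):
    """Spatial output size of a conv layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

# 3x3 kernel, stride 2, padding 1: 64 -> 32 (the 64x64x3 => 32x32x64 case)
print(conv_out_size(64, k=3, s=2, p=1))  # 32

# 7x7 kernel, stride 1, no padding: 64 - 7 + 1 = 58
print(conv_out_size(64, k=7, s=1, p=0))  # 58
```

So a stride of (1, 1) with a 7x7 kernel and no padding would indeed give 58, which is why the padding setting matters here.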
Then I might have a mistake in the figure. It seems the stride is the last layer. Sorry for that
OK, I still have another confusion: the Generator architecture doesn't set the "padding" argument, so every conv layer will shrink the feature-map size. Setting stride = 2 downscales by 2x relative to the original image, but I think it only does so exactly when "padding" is also set. If you don't set it, how do you ensure the Generator models the linear transformation "*K | 2" correctly when there is no padding?
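The concern can be checked numerically: with a 3x3 kernel and stride 2, only padding = 1 gives an exact halving. A small sketch using the conv output-size formula (the helper name is illustrative):

```python
def conv_out_size(n, k, s=1, p=0):
    """Spatial output size of a conv layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

# stride 2 without padding: (64 - 3)//2 + 1 = 31, NOT an exact halving
print(conv_out_size(64, k=3, s=2, p=0))  # 31

# stride 2 with padding 1: (64 + 2 - 3)//2 + 1 = 32, exact halving
print(conv_out_size(64, k=3, s=2, p=1))  # 32
```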
PyTorch's default pads the image to keep the resolution. To ignore boundary artifacts I discard them (as far as I recall, I use a mask to discard them in the loss, or perhaps simply crop them out).
Thanks, I saw the mask/shave code. But I don't think PyTorch pads by default to keep the resolution, because it doesn't have a "same" padding mode. https://discuss.pytorch.org/t/same-padding-equivalent-in-pytorch/85121
The print-statement results also confirm the conclusion above.
Hi, Mr. selfibk, I have a confusion: what does the number 32 mean in the generator (G) network architecture?
How did the conv layers transform the image from size 64 to size 32?
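One way to see the halving intuitively: with stride 2, the kernel window starts at every second input position, so a 64-pixel axis yields 32 output positions (assuming padding keeps all windows in bounds). A tiny sketch:

```python
# Window start positions along one 64-pixel axis with stride 2.
starts = list(range(0, 64, 2))
print(len(starts))  # 32 output positions per axis, hence 64x64 -> 32x32
```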