sefibk / KernelGAN


network parameter asking #63

Open RaySunWHUT opened 3 years ago

RaySunWHUT commented 3 years ago

Hi, Mr. sefibk, I have a question: what does the number 32 mean in the generator (G) network architecture?

How does the conv. layer resize the image from size 64 to size 32?

(image attachment)

sefibk commented 3 years ago

This is a basic convolutional layer. A convolutional layer with stride 2 resizes the image from 64x64 to 32x32. If you use 64 filters of size 3, you get an output with 64 channels. Hence, an input of 64x64x3 => 32x32x64.
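A minimal sketch of that layer in PyTorch (the exact layer parameters here are an illustration, not necessarily the ones used in KernelGAN; with kernel 3 and stride 2, padding=1 is what makes the output exactly half the input size):

```python
import torch
import torch.nn as nn

# 64 filters of size 3x3, stride 2, padding 1:
# spatial size is exactly halved, channels go from 3 to 64.
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=2, padding=1)

x = torch.randn(1, 3, 64, 64)   # NCHW: one 64x64 RGB image
y = conv(x)
print(tuple(y.shape))           # -> (1, 64, 32, 32)
```

The output size follows floor((64 + 2*1 - 3) / 2) + 1 = 32.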

RaySunWHUT commented 3 years ago

But when I visualize the Generator network architecture, the stride is (1, 1), and the first layer has kernel_size = (7, 7). In that case, shouldn't the output size after feeding the input image through the first layer be 64 - 7 + 1 = 58?
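The arithmetic in the question can be checked directly: with PyTorch's default padding of 0, a 7x7 kernel at stride 1 shrinks a 64x64 input as out = (in - kernel) / stride + 1 = (64 - 7) / 1 + 1 = 58 (layer parameters here are just for illustration):

```python
import torch
import torch.nn as nn

# kernel_size=7, stride=1, and padding left at its default of 0
conv = nn.Conv2d(3, 64, kernel_size=7, stride=1)

y = conv(torch.randn(1, 3, 64, 64))
print(tuple(y.shape))  # -> (1, 64, 58, 58)
```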

(image attachment)

sefibk commented 3 years ago

Then I might have a mistake in the figure. It seems the stride is in the last layer. Sorry about that.

RaySunWHUT commented 3 years ago

OK, I still have another question: the Generator architecture doesn't set the "padding" attribute, so every conv. layer will shrink the feature map. Setting stride = 2 downscales by x2 relative to the original image only when padding is set; if padding is not set, how can the Generator correctly model the linear transformation "*K | 2" (convolve with the kernel, then subsample by 2)?

sefibk commented 3 years ago

PyTorch's default pads the image to keep the resolution. To ignore boundary artifacts, I discard them (as far as I recall, I use a mask to discard them in the loss, or perhaps simply crop them out).
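The boundary-discarding idea can be sketched like this (a hypothetical `crop_border` helper and L1 loss, not KernelGAN's actual code; the real implementation may use a mask instead of cropping):

```python
import torch
import torch.nn.functional as F

def crop_border(t, shave=4):
    """Drop `shave` pixels from each spatial border (hypothetical helper)."""
    return t[..., shave:-shave, shave:-shave]

# Compute the loss only on the interior, ignoring boundary artifacts.
pred = torch.randn(1, 3, 64, 64)
target = torch.randn(1, 3, 64, 64)
loss = F.l1_loss(crop_border(pred), crop_border(target))
print(tuple(crop_border(pred).shape))  # -> (1, 3, 56, 56)
```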

RaySunWHUT commented 3 years ago

Thanks, I saw the mask/shave code. But I don't think PyTorch pads by default to keep the resolution, because it doesn't have a "same"-padding mode: https://discuss.pytorch.org/t/same-padding-equivalent-in-pytorch/85121

The printed shapes also confirm this conclusion.
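This point is easy to verify: when no `padding` argument is given, `nn.Conv2d` defaults to zero padding, so the resolution shrinks:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3)  # no padding argument given
print(conv.padding)                    # -> (0, 0): the default is no padding

y = conv(torch.randn(1, 3, 64, 64))
print(tuple(y.shape))                  # -> (1, 8, 62, 62): resolution shrinks
```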