Open ShadowLau opened 7 years ago
@ShadowLau As written in the paper, we designed the input size to produce 3x3 feature maps at conv3. Our network converts a 75x75 input to 1x1 at conv3; the stride of conv3 w.r.t. the input is 16 (=2x2x2x2x1), so a (75+16k)x(75+16k) input produces (1+k)x(1+k) at conv3. With k=2, a 107x107 input gives 3x3.
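The size rule above can be checked with a short Python sketch that pushes an input size through the layers one at a time. The strides (2, 2, 2, 2, 1) come from this thread; the kernel sizes (7, 3, 5, 3, 3), the layer names, and the absence of padding are assumptions, chosen because they reproduce the 75 -> 1 and 107 -> 3 figures quoted here. The helper `output_size` is hypothetical and not part of the released code.

```python
# Assumed layer spec as (name, kernel, stride). Strides are from the thread;
# kernel sizes and zero padding are assumptions, not confirmed here.
LAYERS = [
    ("conv1", 7, 2),
    ("pool1", 3, 2),
    ("conv2", 5, 2),
    ("pool2", 3, 2),
    ("conv3", 3, 1),
]


def output_size(input_size, layers=LAYERS):
    """Spatial size after all layers, for a square input with no padding."""
    size = input_size
    for _name, kernel, stride in layers:
        # Standard size rule for an unpadded convolution or pooling layer.
        size = (size - kernel) // stride + 1
    return size


# A (75 + 16k) input should give a (1 + k) map at conv3.
for k in range(4):
    n = 75 + 16 * k
    print(f"{n}x{n} -> {output_size(n)}x{output_size(n)}")
```

Under these assumptions, k=2 corresponds to the 107x107 input and the 3x3 map at conv3.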
@HyeonseobNam Thank you. I can follow how 107x107 becomes 3x3 step by step (layer by layer). I just cannot understand why the stride is 16. Do you mean that a "x2 pool" is equivalent to "stride 2"?
@ShadowLau Right :) Pooling sizes are equal to pooling strides in our network.
@HyeonseobNam Got it :) Thank you very much!
@HyeonseobNam Thank you for generously making your excellent work open. While reading your paper, I got confused by the network architecture. The paper says that the input size is "107 = 75 (receptive) + 2x16 (stride)". Could you explain how to get the "75" and the "2x16"? Thank you very much again.
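For readers who want to derive the receptive field of 75 and the stride of 16 themselves, here is a minimal Python sketch of the usual layer-by-layer recurrence. The strides (2, 2, 2, 2, 1) are the ones given in this thread; the kernel sizes (7, 3, 5, 3, 3) are an assumption, chosen because they yield the receptive field of 75 quoted from the paper. The function name `receptive_field_and_stride` is hypothetical.

```python
# Assumed (kernel, stride) pairs for conv1, pool1, conv2, pool2, conv3.
# Strides are from the thread; kernel sizes are an assumption.
LAYER_SPECS = [(7, 2), (3, 2), (5, 2), (3, 2), (3, 1)]


def receptive_field_and_stride(layers):
    """Return (receptive field, cumulative stride) of the last layer's units."""
    rf, stride = 1, 1
    for kernel, layer_stride in layers:
        # Each layer widens the receptive field by (kernel - 1) steps,
        # measured in input pixels at the current cumulative stride.
        rf += (kernel - 1) * stride
        stride *= layer_stride
    return rf, stride


rf, stride = receptive_field_and_stride(LAYER_SPECS)
# One conv3 unit sees rf pixels; each extra unit per side adds one stride,
# so a 3x3 map needs rf + 2 * stride input pixels per side.
print(rf, stride, rf + 2 * stride)
```

Under these assumptions, the three values are the receptive field, the cumulative stride, and the input size needed for a 3x3 map at conv3.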