Closed Lancelod-Liu closed 9 years ago
The conv2d operation in Theano (and in all of the recent optimized versions I'm aware of, such as cuDNN, Caffe and cuda-convnet) does indeed have one 2D filter per input feature map and output feature map, and sums over all the input feature maps.
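To make those semantics concrete, here is a minimal numpy mock-up of a fully-connected convolution layer (shapes and the naive correlation helper are illustrative assumptions; Theano's conv2d also flips the kernel, which is omitted here for brevity):

```python
import numpy as np

def corr2d_valid(x, k):
    """Naive 'valid' 2D cross-correlation of one map with one 2D kernel."""
    kh, kw = k.shape
    return np.array([[np.sum(x[i:i + kh, j:j + kw] * k)
                      for j in range(x.shape[1] - kw + 1)]
                     for i in range(x.shape[0] - kh + 1)])

def conv_layer(inputs, filters):
    """Fully-connected convolution in the sense described above.
    inputs:  (n_in, H, W); filters: (n_out, n_in, kh, kw).
    Each output map is the sum of one 2D convolution per input map."""
    n_out, n_in = filters.shape[:2]
    return np.array([
        sum(corr2d_valid(inputs[i], filters[o, i]) for i in range(n_in))
        for o in range(n_out)
    ])

x = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)  # 2 input maps
f = np.ones((1, 2, 2, 2))                               # 1 output map
out = conv_layer(x, f)                                  # shape (1, 2, 2)
```

Note that there is no way to "skip" an input map here: every output map sums over all n_in input maps.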
The easiest way to implement the exact LeNet5 architecture would be to initialize some filters to 0 where you do not want any connection, and make sure that you do not update those values (since the gradient with respect to them will generally be non-zero, an unmasked update would make them non-zero again).
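A numpy stand-in for that suggestion: zero the filter banks you want disconnected, and multiply the gradient by the same binary mask so a gradient step never touches them. The shapes, mask pattern, and learning rate below are illustrative assumptions, not taken from the tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3, 5, 5))   # filters: (n_out, n_in, kh, kw)

# Connection mask: mask[o, i] == 1 iff output map o sees input map i.
# Trailing axes broadcast the mask over each 5x5 kernel.
mask = np.array([[1, 1, 0],
                 [1, 0, 1],
                 [0, 1, 1],
                 [1, 1, 1]], dtype=float)[:, :, None, None]

W *= mask                            # disconnected filters start at 0
grad = rng.normal(size=W.shape)      # stand-in for the backprop gradient
lr = 0.1
W -= lr * (grad * mask)              # masked SGD step: zeroed filters stay 0
```

In Theano this would correspond to an update pair along the lines of `(W, W - lr * mask * T.grad(cost, W))`, so the masked entries are never modified.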
Should we add a note in the tutorial explaining this difference, and mentioning that this kind of sparse connectivity is no longer common practice?
Currently, LeNet5 is implemented using the code below to get the convolved output:
where conv.conv2d is a fully-connected convolution operation: each output feature map is obtained by convolving all of the input feature maps with its kernels and summing the results.

However, we know that LeNet5 does not actually work that way. In LeNet5, each output feature map is obtained by convolving only a selected subset of the input feature maps with its kernels, as shown below (image from Zohra Saidane and Christophe Garcia, "Automatic scene text recognition using a convolutional neural network," in Proceedings of the Second International Workshop).

This connection scheme acts as a form of regularization. I think we could manually convolve selected subsets of the input maps and then combine the sub-outputs to achieve this. So my question is: is there a simple way to achieve this in Theano? Thanks.