Closed Lancelod-Liu closed 9 years ago
The conv2d operation in Theano (and in all of the recent optimized versions I'm aware of, such as cuDNN, Caffe and cuda-convnet) does indeed have one 2D filter per input feature map and output feature map, and sums over all the input feature maps.
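To make those semantics concrete, here is a minimal numpy mock-up of a fully-connected convolution layer (shapes and the naive correlation helper are illustrative assumptions; Theano's conv2d also flips the kernel, which is omitted here for brevity):

```python
import numpy as np

def corr2d_valid(x, k):
    """Naive 'valid' 2D cross-correlation of one map with one 2D kernel."""
    kh, kw = k.shape
    return np.array([[np.sum(x[i:i + kh, j:j + kw] * k)
                      for j in range(x.shape[1] - kw + 1)]
                     for i in range(x.shape[0] - kh + 1)])

def conv_layer(inputs, filters):
    """Fully-connected convolution in the sense described above.
    inputs:  (n_in, H, W); filters: (n_out, n_in, kh, kw).
    Each output map is the sum of one 2D convolution per input map."""
    n_out, n_in = filters.shape[:2]
    return np.array([
        sum(corr2d_valid(inputs[i], filters[o, i]) for i in range(n_in))
        for o in range(n_out)
    ])

x = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)  # 2 input maps
f = np.ones((1, 2, 2, 2))                               # 1 output map
out = conv_layer(x, f)                                  # shape (1, 2, 2)
```

Note that there is no way to "skip" an input map here: every output map sums over all n_in input maps.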
The easiest way to implement the exact LeNet5 architecture would be to initialize some filters to 0 where you do not want any connection, and make sure that you do not update those values (since the gradient with respect to them will generally be non-zero, an unmasked update would make them non-zero again).
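A numpy stand-in for that suggestion: zero the filter banks you want disconnected, and multiply the gradient by the same binary mask so a gradient step never touches them. The shapes, mask pattern, and learning rate below are illustrative assumptions, not taken from the tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3, 5, 5))   # filters: (n_out, n_in, kh, kw)

# Connection mask: mask[o, i] == 1 iff output map o sees input map i.
# Trailing axes broadcast the mask over each 5x5 kernel.
mask = np.array([[1, 1, 0],
                 [1, 0, 1],
                 [0, 1, 1],
                 [1, 1, 1]], dtype=float)[:, :, None, None]

W *= mask                            # disconnected filters start at 0
grad = rng.normal(size=W.shape)      # stand-in for the backprop gradient
lr = 0.1
W -= lr * (grad * mask)              # masked SGD step: zeroed filters stay 0
```

In Theano this would correspond to an update pair along the lines of `(W, W - lr * mask * T.grad(cost, W))`, so the masked entries are never modified.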
Should we add a note in the tutorial explaining this difference, and mentioning that this kind of sparse connectivity is no longer common practice?
Currently, LeNet5 is implemented using the code below to get the convolved output:
where conv.conv2d is a fully-connected convolution operation: each output feature map is obtained by convolving all of the input feature maps with its kernels and summing the results.

However, we know that LeNet5 does not actually work that way. In LeNet5, each output feature map is obtained by convolving only a selected subset of the input feature maps with its kernels, as shown below (image from Zohra Saidane and Christophe Garcia, "Automatic scene text recognition using a convolutional neural network," in Proceedings of the Second International Workshop).

This connection scheme acts as a form of regularization. I think we could manually convolve selected subsets of the input maps and then combine the sub-outputs to achieve this. So my question is: is there a simple way to achieve this in Theano? Thanks.