duggalrahul / AlexNet-Experiments-Keras

Code examples for training AlexNet using Keras and Theano
MIT License

Why split tensor for layers 2, 4 and 5? #7

Open G33kyKitty opened 6 years ago

G33kyKitty commented 6 years ago

Hello Rahul,

I am new to deep machine learning.

I came across your code, AlexNet-Experiments-Keras, on GitHub. Thank you for the well-documented guidelines. They really helped me.

However, I cannot understand why you split the tensor first before performing the convolution for layers 2, 4 and 5? (https://github.com/duggalrahul/AlexNet-Experiments-Keras/blob/master/convnets-keras/convnetskeras/customlayers.py)

Also, why is this process not done on layer 3? (https://github.com/duggalrahul/AlexNet-Experiments-Keras/blob/master/Code/alexnet_base.py)

I would be very grateful if you could answer my question.

Looking forward to your response.

Thanking you in advance.

Kind regards..

duggalrahul commented 6 years ago

Hi @G33kyKitty

The split tensor implements the specific AlexNet design described in the original paper. The convolutions in layers 2, 4 and 5 are split so that the upper and lower halves of the network (see fig. 2 of the paper) can run on separate GPUs, with each half seeing only its own half of the previous layer's feature maps. This kept the memory per GPU low (try calculating the size of the convolution output feature maps with and without splitting). Layer 3 is not split because, in the paper, its kernels are connected to all the kernel maps of layer 2, so it is one of the points where the two halves exchange data. Since current GPUs offer much more memory than was available in 2011, splitting is no longer as important: the entire AlexNet architecture fits on most modern GPUs.
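For anyone who wants to try the calculation suggested above, here is a minimal sketch in plain Python. It assumes the layer-2 shapes from the AlexNet paper (a 27x27x96 input, 256 kernels of size 5x5, padding 2); the helper name `conv_stats` and its arguments are made up for illustration and are not part of this repository.

```python
def conv_stats(in_h, in_w, in_ch, out_ch, kernel, groups=1, stride=1, pad=0):
    """Count weights and output activations of a (possibly split) convolution.

    `groups` is the number of independent halves the channels are split into
    (2 for the two-GPU layout in the paper, 1 for an ordinary convolution).
    Biases are ignored to keep the arithmetic simple.
    """
    out_h = (in_h + 2 * pad - kernel) // stride + 1
    out_w = (in_w + 2 * pad - kernel) // stride + 1

    # Each group only looks at in_ch / groups input channels
    # and produces out_ch / groups output channels.
    weights_per_group = kernel * kernel * (in_ch // groups) * (out_ch // groups)
    outputs_per_group = out_h * out_w * (out_ch // groups)

    return {
        "output_shape": (out_h, out_w, out_ch),
        "weights_total": weights_per_group * groups,
        "weights_per_group": weights_per_group,
        "outputs_total": outputs_per_group * groups,
        "outputs_per_group": outputs_per_group,
    }


# Layer 2 of AlexNet, assuming the shapes given in the paper.
unsplit = conv_stats(27, 27, 96, 256, kernel=5, pad=2, groups=1)
split = conv_stats(27, 27, 96, 256, kernel=5, pad=2, groups=2)

for name, stats in (("unsplit", unsplit), ("split", split)):
    print(name, stats)
```

With two groups, each GPU holds half of the output feature maps and a quarter of the weights an unsplit layer would need, so the split layer has half as many weights in total.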

G33kyKitty commented 6 years ago

@duggalrahul Thank you..