Open G33kyKitty opened 6 years ago
Hi @G33kyKitty
The split tensor implements the specific two-branch design of the original AlexNet paper. Splitting lets the upper and lower halves of the network (refer to fig. 2 of the paper) run on separate GPUs, which kept the memory footprint per GPU low (try calculating the size of the convolution output feature maps both with and without splitting). Since current GPUs offer much more memory than was available in 2012, splitting is no longer as important: the entire AlexNet architecture fits on most modern GPUs.
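The suggested calculation can be sketched as follows. This is not the repository's code, just a back-of-the-envelope check, using AlexNet's layer-2 convolution (96 input maps, 256 output maps, 5x5 kernels) as an example; the helper `conv_params` is a hypothetical name introduced here for illustration:

```python
def conv_params(c_in, c_out, k, groups=1):
    """Number of convolution weights when the c_in input maps and
    c_out output maps are partitioned into `groups` independent branches.
    Each group convolves c_in/groups maps into c_out/groups maps."""
    return groups * (k * k * (c_in // groups) * (c_out // groups))

# AlexNet layer 2: 96 -> 256 feature maps, 5x5 kernels.
unsplit = conv_params(96, 256, 5, groups=1)
split   = conv_params(96, 256, 5, groups=2)
print(unsplit, split)  # 614400 307200
```

Splitting into two branches halves the weight count, and each GPU only has to hold its own half of the output feature maps (128 of the 256 maps), which is where the per-GPU memory saving comes from.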
@duggalrahul Thank you!
Hello Rahul,
I am new to deep machine learning.
I came across your codes AlexNet-Experiments-Keras on Github. Thank you for the well documented guidelines. It really helped me.
However, I cannot understand why you split the tensor before performing the convolution in layers 2, 4, and 5 (https://github.com/duggalrahul/AlexNet-Experiments-Keras/blob/master/convnets-keras/convnetskeras/customlayers.py).
Also, why is this not done in layer 3? (https://github.com/duggalrahul/AlexNet-Experiments-Keras/blob/master/Code/alexnet_base.py)
I would be very grateful if you could answer my question.
Looking forward to your response.
Thanking you in advance.
Kind regards,