A 1x1 convolution does not reduce dimensionality in space, but it can reduce dimensionality across channels.
1x1 convolutions are used to compute reductions before the expensive 3x3 and 5x5 convolutions.
1x1xD convolutions not only reduce the number of feature maps passed to the next layer, but also introduce new parameters and, when followed by an activation, a new non-linearity into the network, which can help increase model accuracy.
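To make the "reduce before the expensive convolution" point concrete, here is a rough multiply count for a 5x5 convolution with and without a 1x1 reduction in front of it. The sizes (28x28 input, 192 input channels, 16 reduced channels, 32 output channels) and the helper name conv_multiplies are assumptions chosen for illustration, not values from these notes; stride 1 and "same" padding are assumed and biases are ignored.

```python
def conv_multiplies(height, width, in_channels, out_channels, kernel):
    """Multiplications for one stride-1, same-padded conv layer (bias ignored)."""
    # Each of the height*width*out_channels outputs needs
    # kernel*kernel*in_channels multiplications.
    return height * width * out_channels * kernel * kernel * in_channels


# Hypothetical layer sizes, chosen only for illustration.
H, W = 28, 28
C_IN, C_MID, C_OUT = 192, 16, 32

# 5x5 convolution applied directly to all 192 input channels.
direct = conv_multiplies(H, W, C_IN, C_OUT, kernel=5)

# 1x1 convolution down to 16 channels, then the 5x5 convolution.
reduced = (conv_multiplies(H, W, C_IN, C_MID, kernel=1)
           + conv_multiplies(H, W, C_MID, C_OUT, kernel=5))

print("direct 5x5:      ", direct)
print("1x1 then 5x5:    ", reduced)
print("reduction factor:", round(direct / reduced, 1))
```

Under these assumed sizes the 1x1 layer costs little because its kernel covers a single pixel, while the 5x5 layer becomes cheaper in proportion to the drop in input channels.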
(?) WxHxD -> (1x1xD)x1 -> WxHx1
7x7x2048 -> 1x1x2048x512 -> 7x7x512
from 96 feature maps to 32 feature maps using 32 filters of 1x1 convolution
1x1 Convolution
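The shape examples above can be checked with a small NumPy sketch. It treats a 1x1 convolution as the same D_in x D_out matrix applied independently at every pixel. The function name conv1x1, the height-width-channel layout, the random inputs, and the optional ReLU are assumptions made for illustration; deep learning frameworks usually store the weights as a 4-D kernel instead.

```python
import numpy as np


def conv1x1(x, weights, bias=None, relu=False):
    """1x1 convolution over a (H, W, D_in) array.

    weights has shape (D_in, D_out): one column per 1x1xD_in filter.
    """
    # matmul broadcasts over the two spatial axes, so every pixel's
    # channel vector is multiplied by the same weight matrix.
    y = x @ weights
    if bias is not None:
        y = y + bias
    if relu:
        # Assumed activation; this is where the extra non-linearity comes from.
        y = np.maximum(y, 0.0)
    return y


rng = np.random.default_rng(0)

# 7x7x2048 -> 512 filters of size 1x1x2048 -> 7x7x512
x = rng.standard_normal((7, 7, 2048))
w = rng.standard_normal((2048, 512))
y = conv1x1(x, w)
print(x.shape, "->", y.shape)

# W x H x D -> a single 1x1xD filter -> W x H x 1
y_single = conv1x1(x, w[:, :1])
print(x.shape, "->", y_single.shape)

# 96 feature maps -> 32 feature maps using 32 filters (spatial size assumed)
x96 = rng.standard_normal((28, 28, 96))
w96 = rng.standard_normal((96, 32))
y32 = conv1x1(x96, w96, relu=True)
print(x96.shape, "->", y32.shape)
```

The spatial size is unchanged in every case; only the channel count changes, and the number of weights is D_in x D_out (plus one bias per filter if used).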