InstantWindy closed this issue 5 years ago
@yun-liu I guess this is not where the richer convolutional features come from. The 1x1 conv introduced after each 3x3 conv is used to reduce the number of channels and reorganize the feature maps. After the eltwise layer, these reduced maps are added together element-wise, and the result could be viewed as a kind of hypercolumn.
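To make the two steps above concrete, here is a minimal pure-Python sketch, not the repository's actual Caffe definition. It treats a 1x1 conv as a per-pixel linear map over channels and then sums the reduced maps element-wise. The sizes (8 input channels, 3 reduced channels, 4x4 maps), the random weights, and the helper `random_tensor` are all assumptions made up for illustration.

```python
import random

def conv1x1(feature, weights):
    """1x1 convolution: a per-pixel linear map over channels.

    feature: nested lists shaped [C_in][H][W]; weights: [C_out][C_in].
    """
    c_in = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    return [
        [[sum(row[c] * feature[c][y][x] for c in range(c_in))
          for x in range(width)]
         for y in range(height)]
        for row in weights
    ]

def eltwise_sum(maps):
    """Element-wise sum of equally shaped [C][H][W] feature maps."""
    first = maps[0]
    return [
        [[sum(m[c][y][x] for m in maps) for x in range(len(first[0][0]))]
         for y in range(len(first[0]))]
        for c in range(len(first))
    ]

def random_tensor(rng, *shape):
    """Hypothetical helper: nested lists of random floats with the given shape."""
    if len(shape) == 1:
        return [rng.uniform(-1.0, 1.0) for _ in range(shape[0])]
    return [random_tensor(rng, *shape[1:]) for _ in range(shape[0])]

rng = random.Random(0)
C_IN, C_REDUCED, H, W = 8, 3, 4, 4  # illustrative sizes, not from the thread

# Stand-ins for the outputs of two 3x3 conv layers in the same stage.
conv_a = random_tensor(rng, C_IN, H, W)
conv_b = random_tensor(rng, C_IN, H, W)

# Each 3x3 output gets its own 1x1 conv that reduces 8 channels to 3.
w_a = random_tensor(rng, C_REDUCED, C_IN)
w_b = random_tensor(rng, C_REDUCED, C_IN)
side_a = conv1x1(conv_a, w_a)
side_b = conv1x1(conv_b, w_b)

# The eltwise step adds the reduced maps position by position.
fused = eltwise_sum([side_a, side_b])
print(len(fused), len(fused[0]), len(fused[0][0]))  # prints "3 4 4"
```

The 1x1 conv changes only the channel count and leaves the spatial size untouched, which is why the reduced maps can be summed position by position afterwards.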
Hello! I wonder why richer convolutional features can be obtained by adding a 1x1 convolution after a 3x3 convolution. Thank you!