NorbertZheng / read-papers

My paper reading notes.

Sik-Ho Tang | Review -- Striving for Simplicity: The All Convolutional Net. #97


NorbertZheng commented 1 year ago

Sik-Ho Tang. Review — Striving for Simplicity: The All Convolutional Net.

NorbertZheng commented 1 year ago

Overview

Striving for Simplicity: The All Convolutional Net (All-CNN), by University of Freiburg. 2015 ICLR Workshop, over 3800 citations.

Convolutional Neural Network, CNN, Image Classification.

NorbertZheng commented 1 year ago

Proposed All-CNN

Two Means to Replace Pooling

Image: Convolution With Stride Larger Than 1 to Reduce Spatial Size More Aggressively.

There are two means suggested to replace pooling for spatial dimensionality reduction (i.e. along the width and height axes, instead of the filter axis):

1. Remove each pooling layer and increase the stride of the convolutional layer that precedes it.
2. Replace each pooling layer by a normal convolution with stride larger than 1 (with the number of output channels equal to the number of input channels).

The second option results in an increase of overall network parameters, yet without loss in accuracy. Both options are sketched below.
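A minimal PyTorch sketch of both options (the channel counts, kernel sizes, and padding here are illustrative assumptions, not the paper's exact settings):

```python
import torch
import torch.nn as nn

# A conventional block: stride-1 conv followed by max-pooling.
baseline = nn.Sequential(
    nn.Conv2d(96, 96, kernel_size=3, stride=1, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)

# Option 1: remove the pool and fold its stride into the existing conv.
option1 = nn.Sequential(
    nn.Conv2d(96, 96, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)

# Option 2: keep the stride-1 conv and swap the pool for a stride-2 conv
# with matching channels, which adds 96*96*3*3 = 82,944 extra weights.
option2 = nn.Sequential(
    nn.Conv2d(96, 96, kernel_size=3, stride=1, padding=1), nn.ReLU(),
    nn.Conv2d(96, 96, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)

x = torch.randn(1, 96, 32, 32)
for m in (baseline, option1, option2):
    print(m(x).shape)  # all produce torch.Size([1, 96, 16, 16])
```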

NorbertZheng commented 1 year ago

Global Average Pooling to Replace Fully Connected Layer

Image: Global Average Pooling in NIN to Replace Fully Connected Layer (Image from NIN).

Using global average pooling to replace the fully connected layer removes a large number of parameters.
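As a rough illustration of the savings, a hedged PyTorch comparison (the 192-channel 8×8 feature map and 10 classes are assumed sizes):

```python
import torch
import torch.nn as nn

feat = torch.randn(1, 192, 8, 8)  # final conv feature map (assumed size)

# Fully connected head: flatten then project; 192*8*8*10 = 122,880 weights.
fc_head = nn.Sequential(nn.Flatten(), nn.Linear(192 * 8 * 8, 10))

# All-conv head: a 1x1 conv to 10 class maps (192*10 = 1,920 weights),
# then parameter-free global average pooling over the spatial axes.
gap_head = nn.Sequential(
    nn.Conv2d(192, 10, kernel_size=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

print(fc_head(feat).shape, gap_head(feat).shape)  # both: torch.Size([1, 10])
```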

NorbertZheng commented 1 year ago

Three Base Models

Image: The three base networks used for classification on CIFAR-10 and CIFAR-100.

Overall, three base models (A, B, and C) are suggested, each consisting only of convolutional layers with rectified linear non-linearities, plus an averaging + softmax layer to produce predictions over the whole image.
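For concreteness, a minimal PyTorch sketch of base model C (the layer widths follow the paper; the padding and pooling details are my assumptions):

```python
import torch
import torch.nn as nn

def conv(cin, cout, k, s=1):
    # conv + ReLU; padding keeps the spatial size at stride 1
    return nn.Sequential(nn.Conv2d(cin, cout, k, stride=s, padding=k // 2), nn.ReLU())

# Base model C (CIFAR-10): only convs + ReLU, two 3x3 max-pools, and an
# averaging head; softmax is applied by the loss during training.
model_c = nn.Sequential(
    conv(3, 96, 3), conv(96, 96, 3),
    nn.MaxPool2d(3, stride=2, padding=1),
    conv(96, 192, 3), conv(192, 192, 3),
    nn.MaxPool2d(3, stride=2, padding=1),
    conv(192, 192, 3), conv(192, 192, 1), conv(192, 10, 1),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

print(model_c(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```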

NorbertZheng commented 1 year ago

Three Derived Models from Base Models

Image: Model description of the three networks derived from base model C, used for evaluating the importance of pooling for classification on CIFAR-10 and CIFAR-100.

Three further models are derived from each base model: Strided-CNN-C removes pooling and increases the stride of the preceding conv layer, ConvPool-CNN-C places an additional dense conv layer before each pooling layer, and All-CNN-C replaces each pooling layer with a stride-2 conv layer. The derived models for base models A and B are built analogously but are not shown in the above table.

Compared with base model A, the 5×5 convolutions are replaced by two consecutive 3×3 convolutions in model C.
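Putting the table into code, a hedged sketch of All-CNN-C, where each max-pool of model C becomes a stride-2 convolution of matching width (the conv helper is as in the previous sketch; padding choices are again assumptions):

```python
import torch
import torch.nn as nn

def conv(cin, cout, k, s=1):
    # conv + ReLU; padding keeps the spatial size at stride 1
    return nn.Sequential(nn.Conv2d(cin, cout, k, stride=s, padding=k // 2), nn.ReLU())

all_cnn_c = nn.Sequential(
    conv(3, 96, 3), conv(96, 96, 3),
    conv(96, 96, 3, s=2),    # replaces the first 3x3 max-pool
    conv(96, 192, 3), conv(192, 192, 3),
    conv(192, 192, 3, s=2),  # replaces the second 3x3 max-pool
    conv(192, 192, 3), conv(192, 192, 1), conv(192, 10, 1),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

print(all_cnn_c(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```

Here AdaptiveAvgPool2d(1) plays the role of the paper's global averaging layer regardless of the exact spatial resolution reached by the last conv.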

NorbertZheng commented 1 year ago

Detailed Architecture

Image: Architecture of the Large All-CNN network for CIFAR-10.

The above shows the detailed architecture for CIFAR-10.

Image: Architecture of the ImageNet network.

The above shows the detailed architecture for ImageNet.

NorbertZheng commented 1 year ago

Experimental Results

Ablation Study

Image: All-CNN-C has the best performance.

SOTA Comparison

Image: Test error on CIFAR-10 and CIFAR-100 for the All-CNN compared to the state of the art from the literature.

On CIFAR-10, the reported All-CNN is All-CNN-C. It outperforms Maxout, NIN, and others. On CIFAR-100, All-CNN-C obtains competitive performance.

ImageNet

For ImageNet, an upscaled version of the All-CNN-B network with 12 convolutional layers is trained.

This network achieves a Top-1 validation error of 41.2% on ILSVRC-2012 when evaluating only on the center 224×224 patch, which is comparable to the 40.7% Top-1 error reported for AlexNet.

NorbertZheng commented 1 year ago

(There are also sections visualizing the feature map responses using deconvolution, similar to ZFNet; please feel free to read the paper.)

NorbertZheng commented 1 year ago

Reference

Sik-Ho Tang. Review — Striving for Simplicity: The All Convolutional Net.