NorbertZheng / read-papers

My paper reading notes.

Sik-Ho Tang | Brief Review -- Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs). #110

Closed NorbertZheng closed 1 year ago

NorbertZheng commented 1 year ago

Sik-Ho Tang. Brief Review — Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs).

NorbertZheng commented 1 year ago

Overview

Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), ELU, by Johannes Kepler University, 2016 ICLR, Over 5000 Citations. Image Classification, Autoencoder, Activation Function, ReLU, Leaky ReLU.

NorbertZheng commented 1 year ago

Exponential Linear Unit (ELU)

Figure: The rectified linear unit (ReLU), the leaky ReLU (LReLU, α = 0.1), the shifted ReLUs (SReLUs), and the exponential linear unit (ELU, α = 1.0).

The ELU hyperparameter $\alpha$ controls the value to which an ELU saturates for negative net inputs:

$$f(x) = \begin{cases} x & \text{if } x > 0 \\ \alpha\,(\exp(x) - 1) & \text{if } x \le 0 \end{cases} \qquad f'(x) = \begin{cases} 1 & \text{if } x > 0 \\ f(x) + \alpha & \text{if } x \le 0 \end{cases}$$
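As a quick illustration (my own NumPy sketch, not code from the paper), the activations compared in the figure above can be written in a few lines; the SReLU shift of −1 is an assumption based on the plotted curves.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: max(0, x).
    return np.maximum(0.0, x)

def lrelu(x, alpha=0.1):
    # Leaky ReLU: slope alpha for negative inputs.
    return np.where(x > 0, x, alpha * x)

def srelu(x, shift=-1.0):
    # Shifted ReLU: max(shift, x); shift = -1 is an assumption from the plot.
    return np.maximum(shift, x)

def elu(x, alpha=1.0):
    # ELU: identity for x > 0, saturates towards -alpha for very negative x.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def elu_grad(x, alpha=1.0):
    # ELU derivative: 1 for x > 0, elu(x) + alpha otherwise.
    return np.where(x > 0, 1.0, elu(x, alpha) + alpha)

x = np.linspace(-5, 2, 8)
print(elu(x))       # saturates towards -1.0 for large negative inputs
print(elu_grad(x))  # gradient stays positive, so units do not die
```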

NorbertZheng commented 1 year ago

Results

MNIST

Figure: (a) Median of the average unit activation for different activation functions. (b) Training cross-entropy loss.

ELU networks maintain a smaller median unit activation throughout the training process, and their training error decreases much more rapidly than that of the other networks.
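A rough sketch of how the quantity in panel (a) could be measured (my own illustration, not the paper's evaluation code): take one hidden layer's activations over a batch of MNIST images, average per unit, then take the median across units.

```python
import numpy as np

def median_average_unit_activation(activations):
    """Median over units of each unit's average activation.

    activations: array of shape (num_samples, num_units), e.g. the output
    of one hidden layer over a batch of MNIST images.
    """
    per_unit_mean = activations.mean(axis=0)  # average activation of each unit
    return np.median(per_unit_mean)           # median across units

# Toy example: ELU-like activations stay centred nearer zero than ReLU ones.
rng = np.random.default_rng(0)
pre_act = rng.normal(size=(1024, 128))
relu_act = np.maximum(0.0, pre_act)
elu_act = np.where(pre_act > 0, pre_act, np.exp(pre_act) - 1.0)
print(median_average_unit_activation(relu_act))  # > 0, pushed away from zero
print(median_average_unit_activation(elu_act))   # closer to zero
```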

NorbertZheng commented 1 year ago

Autoencoder

Figure: Autoencoder training on MNIST: reconstruction error on the training and test sets over epochs, for different activation functions and learning rates.

ELUs outperform the competing activation functions in terms of training / test set reconstruction error for all learning rates.
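A minimal PyTorch sketch of such a comparison, with the activation passed as a parameter; the layer sizes, optimizer, and learning rate below are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn

def make_autoencoder(activation=nn.ELU):
    # Layer sizes are illustrative assumptions, not the architecture from the paper.
    encoder = nn.Sequential(nn.Linear(784, 256), activation(),
                            nn.Linear(256, 64), activation())
    decoder = nn.Sequential(nn.Linear(64, 256), activation(),
                            nn.Linear(256, 784))
    return nn.Sequential(encoder, decoder)

model = make_autoencoder(nn.ELU)          # swap in nn.ReLU / nn.LeakyReLU to compare
criterion = nn.MSELoss()                  # reconstruction error
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

x = torch.rand(32, 784)                   # stand-in for a batch of flattened MNIST digits
loss = criterion(model(x), x)             # reconstruct the input
loss.backward()
optimizer.step()
```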

NorbertZheng commented 1 year ago

CIFAR-100

Figure: Comparison of ReLUs, LReLUs, and SReLUs on CIFAR-100. (a-c) show the training loss, (d-f) the test classification error.

ELU networks achieved the lowest test error and training loss.

NorbertZheng commented 1 year ago

CIFAR-10 & CIFAR-100

Figure: Comparison of ELU networks and other CNNs on CIFAR-10 and CIFAR-100.

ELU networks are the second best on CIFAR-10 with a test error of 6.55%, which still places them among the top 10 results reported for CIFAR-10. On CIFAR-100, ELU networks performed best with a test error of 24.28%, the best published result on CIFAR-100 at the time.

NorbertZheng commented 1 year ago

ImageNet

Figure: ELU networks applied to ImageNet.

The ELU network already reaches the 20% top-5 error rate after 160k iterations, while the ReLU network needs 200k iterations to reach the same error rate.

NorbertZheng commented 1 year ago

Reference

Clevert, D.-A., Unterthiner, T., & Hochreiter, S. (2016). Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs). ICLR 2016.