CNNs have made major recent advances in visual recognition, speech recognition, and natural language processing, driven by the rapid growth in labeled data and in powerful GPUs. This paper surveys recent improvements to CNNs across several areas: layer design, activation functions, loss functions, regularization, optimization, and fast computation.
It also discusses applications to computer vision tasks such as object detection and object tracking.
Key Points
Convolutional layers learn feature representations of the input image; each unit in a feature map is connected only to a local region (receptive field) of the previous layer.
Feature maps are obtained by convolving a set of learned kernels across the image and then applying the activation function to the results.
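As a concrete illustration of that step, here is a minimal NumPy sketch (not the paper's code) of producing one feature map: slide a learned kernel over the image with stride 1 and valid padding, then apply a ReLU activation. The image and kernel values are made up for the example.

```python
import numpy as np

def conv2d_relu(image, kernel):
    """Valid convolution (stride 1) of one kernel over one image,
    followed by ReLU -- how a single feature map is produced."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Element-wise product of kernel and image patch, summed
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return np.maximum(out, 0)  # ReLU activation

# Toy 4x4 "image" and a 2x2 vertical-difference kernel (hypothetical values)
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[0., -1.], [0., 1.]])
fmap = conv2d_relu(image, kernel)
print(fmap.shape)  # (3, 3)
```

A real convolutional layer applies a whole bank of such kernels, yielding one feature map per kernel.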
The pooling layer achieves shift-invariance by reducing the resolution of the feature maps; each pooling layer operates on the feature maps of the preceding convolutional layer.
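Max pooling, the most common choice, can be sketched as follows (a minimal NumPy example with made-up values, not the paper's code): each output value is the maximum over a non-overlapping 2x2 window, so small shifts of a feature within a window leave the output unchanged.

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """Downsample a feature map by taking the max over each window --
    the resolution reduction that gives (approximate) shift-invariance."""
    h, w = fmap.shape
    oh, ow = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = fmap[i*stride:i*stride+size,
                             j*stride:j*stride+size].max()
    return out

fmap = np.array([[1., 3., 2., 0.],
                 [4., 6., 1., 1.],
                 [0., 2., 9., 8.],
                 [1., 1., 7., 5.]])
pooled = max_pool(fmap)
print(pooled)  # [[6. 2.] [2. 9.]]
```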
By stacking several convolutional and pooling layers, the network gradually extracts higher-level feature representations.
After several convolutional and pooling layers, there may be one or more fully-connected layers that perform high-level reasoning: they connect every neuron in the previous layer to every neuron in the current layer, producing global semantic information.
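The whole pipeline above (convolution + ReLU, then pooling, then a fully-connected layer over the flattened maps) can be sketched end-to-end in NumPy. All sizes and weights here are hypothetical, chosen only to show the data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_relu(x, k):
    # Valid convolution, stride 1, then ReLU
    ih, iw = x.shape
    kh, kw = k.shape
    out = np.array([[np.sum(x[i:i+kh, j:j+kw] * k)
                     for j in range(iw - kw + 1)]
                    for i in range(ih - kh + 1)])
    return np.maximum(out, 0)

def max_pool2(x):
    # Non-overlapping 2x2 max pooling via reshape
    h, w = x.shape
    return x[:h//2*2, :w//2*2].reshape(h//2, 2, w//2, 2).max(axis=(1, 3))

# Hypothetical sizes: an 8x8 "image", one 3x3 kernel, a 10-way classifier
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))
fmap = max_pool2(conv_relu(image, kernel))   # 8x8 -> 6x6 -> 3x3
flat = fmap.ravel()                          # flatten for the FC layer
W = rng.standard_normal((10, flat.size))     # every input feeds every output
b = np.zeros(10)
logits = W @ flat + b                        # fully-connected layer
print(logits.shape)  # (10,)
```

A trained network would learn the kernel and the fully-connected weights jointly by backpropagation; this sketch only traces the forward pass.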
Title
Recent advances in convolutional neural networks
URL
https://reader.elsevier.com/reader/sd/pii/S0031320317304120?token=2A14E8B33C2A53456298CF81D24AD9FD0DD005EB595ECA9FFC1E18638299B1BA5AA3EF3D8A6A9FE07FC6EDFE787EC0BC&originRegion=us-east-1&originCreation=20230223175219
Citation
Repo link