NorbertZheng / read-papers

My paper reading notes.

Sik-Ho Tang | Brief Review -- Representation Learning by Learning to Count. #130

Closed NorbertZheng closed 1 year ago

NorbertZheng commented 1 year ago

Sik-Ho Tang. Brief Review — Representation Learning by Learning to Count.

NorbertZheng commented 1 year ago

Overview

Self-supervised learning by counting the number of visual features.

Representation Learning by Learning to Count (Counting), by University of Bern and University of Maryland. ICCV 2017, over 300 citations. Topics: self-supervised learning, image classification.

NorbertZheng commented 1 year ago

Counting

Conceptual Idea

Figure: The number of visual primitives in the whole image should match the sum of the number of visual primitives in each tile.

We get more data!!! The number of visual primitives in each region should sum up to the number of primitives in the original image.
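The constraint can be checked by hand on a toy example. This is a minimal sketch, not the learned network: `count_primitives` and `split_into_tiles` are hypothetical helpers, and a "primitive" is simply assumed to be a cell marked 1 in a small grid.

```python
def count_primitives(grid):
    """Toy counter: number of cells marked 1 (a stand-in for a visual primitive)."""
    return sum(sum(row) for row in grid)


def split_into_tiles(grid):
    """Split a grid with even side lengths into 4 non-overlapping quadrants."""
    h, w = len(grid) // 2, len(grid[0]) // 2
    return [[row[c:c + w] for row in grid[r:r + h]] for r in (0, h) for c in (0, w)]


image = [
    [1, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
]

tile_counts = [count_primitives(t) for t in split_into_tiles(image)]
total = count_primitives(image)

# The per-tile counts add up to the count of the whole image.
print(tile_counts, sum(tile_counts), total)
```

A hand-written counter satisfies the constraint exactly; the point of the method is to train a network whose outputs satisfy it approximately, without being told what to count.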

NorbertZheng commented 1 year ago

Arithmetic calculation of high-level factors (i.e. high-level features selected by feature engineering).

NorbertZheng commented 1 year ago

Contrastive Loss

Figure: Training AlexNet to learn to count.

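Let $x$ be a color input image, $D$ the downsampling operator, and $T_j$ ($j=1,\dots,4$) the tiling operator that extracts the $j$-th of the 4 non-overlapping tiles of $x$. The naïve way to train the network is to minimize:

$$\ell(x) = \left| \Phi(D \circ x) - \sum_{j=1}^{4} \Phi(T_j \circ x) \right|^2$$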

where $\Phi$ is the CNN to be learnt to count the visual features.

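The naïve loss alone has a trivial solution, $\Phi \equiv 0$. Therefore, for any $x\neq y$, we would like to minimize the contrastive loss:

$$\ell_{con}(x, y) = \left| \Phi(D \circ x) - \sum_{j=1}^{4} \Phi(T_j \circ x) \right|^2 + \max\left\{0,\; M - \left| \Phi(D \circ y) - \sum_{j=1}^{4} \Phi(T_j \circ x) \right|^2\right\}$$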

where the constant scalar $M=10$.

The contrastive term introduces a tradeoff: it pushes the features towards counting as many primitives as are needed to differentiate images from each other.
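To make the role of the contrastive term concrete, here is a minimal sketch that works directly on count vectors (plain lists of floats) rather than on images. It assumes the loss is a squared difference between the count of the downsampled image and the summed tile counts, plus a hinge with margin $M$ on the same quantity for a different image $y$. The function names are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two count vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def sum_vectors(vectors):
    """Element-wise sum of a list of count vectors."""
    return [sum(components) for components in zip(*vectors)]


def naive_loss(whole_x, tiles_x):
    """Counts of the downsampled image should equal the summed tile counts."""
    return squared_distance(whole_x, sum_vectors(tiles_x))


def contrastive_loss(whole_x, tiles_x, whole_y, margin=10.0):
    """Naive term plus a hinge that keeps a different image y at least
    `margin` away (in squared distance) from the tile counts of x."""
    tile_sum = sum_vectors(tiles_x)
    match = squared_distance(whole_x, tile_sum)
    mismatch = squared_distance(whole_y, tile_sum)
    return match + max(0.0, margin - mismatch)


# Collapsed network: every output is zero.
zeros = [0.0, 0.0]
collapsed_naive = naive_loss(zeros, [zeros] * 4)
collapsed_contrastive = contrastive_loss(zeros, [zeros] * 4, zeros)

# Consistent counts for x, and a clearly different count vector for y.
tiles_x = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [0.0, 0.0]]
whole_x = [2.0, 3.0]
whole_y = [6.0, 0.0]
consistent = contrastive_loss(whole_x, tiles_x, whole_y)

print(collapsed_naive, collapsed_contrastive, consistent)
```

The all-zero output minimizes the naive term but pays the full margin under the contrastive loss, whereas counts that are consistent for $x$ and far enough from those of $y$ reach zero loss.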

NorbertZheng commented 1 year ago

Network Architecture

The input is a 114×114 image. The network is AlexNet with a ReLU after the last layer, which keeps the output counts non-negative.
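The 114×114 input size fits the tiling scheme: a 2×2 grid of non-overlapping 114×114 tiles covers a 228×228 crop, and downsampling that crop by a factor of 2 also gives 114×114. The sketch below only checks these shapes. It assumes a single-channel image stored as nested lists and 2×2 average pooling as the downsampling operator; both are simplifications chosen for illustration, and the helper names are hypothetical.

```python
def downsample_2x(img):
    """2x2 average pooling, one plausible choice for the operator D (assumption)."""
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4
         for c in range(0, len(img[0]), 2)]
        for r in range(0, len(img), 2)
    ]


def tile_2x2(img):
    """Operator T: the 4 non-overlapping quadrants of the image."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[row[c:c + w] for row in img[r:r + h]] for r in (0, h) for c in (0, w)]


def shape(img):
    return (len(img), len(img[0]))


side = 228  # implied by four 114x114 tiles; an assumption of this sketch
crop = [[float((r + c) % 256) for c in range(side)] for r in range(side)]

# One downsampled view plus four tiles, all fed to the same network.
network_inputs = [downsample_2x(crop)] + tile_2x2(crop)
shapes = [shape(x) for x in network_inputs]
print(shapes)
```

All five inputs have the same size, so a single shared network can process the downsampled image and each tile.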

The counting network is trained on the 1.3M (i.e. 1'300'000) images from the training set of ImageNet.

NorbertZheng commented 1 year ago

A kind of disentangled (?) representation, but selected through feature engineering.

NorbertZheng commented 1 year ago

Results

Figure: Evaluation of transfer learning on PASCAL.

The proposed method either outperforms previous methods or achieves the second-best performance.

Figure: ImageNet classification with a linear classifier.

Figure: Places classification with a linear classifier.

The proposed method achieves a performance comparable to the other state-of-the-art methods on the ImageNet dataset and shows a significant improvement on the Places dataset.
