
Understanding self-supervised Learning Dynamics without Contrastive Pairs #5

pocca2048 opened this issue 3 years ago

pocca2048 commented 3 years ago

Conference :
Link : http://arxiv.org/abs/2102.06810
Authors' Affiliation : Facebook AI Research
TL;DR : A paper that investigates why SimSiam-style methods work: "How can SSL with only positive pairs avoid representational collapse?"

Summary :

1. Introduction

Minimizing differences between positive pairs encourages modeling invariances, while contrasting negative pairs is thought to be required to prevent representational collapse

However, recent papers such as BYOL and SimSiam have succeeded without negative pairs.

But why these methods do not suffer representational collapse had not yet been explained.

The paper analyzes the behavior of non-contrastive SSL training and the empirical effects of multiple hyperparameters: (1) the Exponential Moving Average (EMA), i.e. momentum encoder, (2) a higher relative learning rate ($\alpha_p$) of the predictor, and (3) the weight decay $\eta$.

The authors explain all these empirical findings with an exceedingly simple theory based on analyzing the nonlinear learning dynamics of simple linear networks.
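The setup is the usual BYOL/SimSiam pipeline. Below is a minimal sketch (my own, not the paper's code; the module sizes and the values of `base_lr`, `alpha_p`, `eta`, `tau` are placeholders) showing where each of the three factors enters a single training step:

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder online encoder + predictor; the real networks are ResNets + MLP heads.
encoder   = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 64))
predictor = nn.Linear(64, 64)                 # extra predictor head on the online branch
target    = copy.deepcopy(encoder)            # momentum (EMA) target branch
for p in target.parameters():
    p.requires_grad_(False)

# alpha_p: predictor lr ratio, eta: weight decay, tau: EMA rate (all placeholder values)
base_lr, alpha_p, eta, tau = 0.05, 10.0, 1e-4, 0.996
opt = torch.optim.SGD(
    [{"params": encoder.parameters(),   "lr": base_lr},
     {"params": predictor.parameters(), "lr": base_lr * alpha_p}],   # (2) higher predictor lr
    weight_decay=eta)                                                # (3) weight decay

def train_step(x1, x2):
    """One non-contrastive update on two augmented views of the same batch."""
    online = predictor(encoder(x1))
    with torch.no_grad():                     # stop-gradient on the target branch
        tgt = target(x2)
    loss = F.mse_loss(F.normalize(online, dim=-1), F.normalize(tgt, dim=-1))
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                     # (1) EMA / momentum update of the target
        for p_t, p_o in zip(target.parameters(), encoder.parameters()):
            p_t.mul_(tau).add_(p_o, alpha=1.0 - tau)
    return loss.item()

loss = train_step(torch.randn(32, 128), torch.randn(32, 128))   # two augmented views
```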

2. Two-layer linear model
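For reference, the analyzed model (as I understand it; notation may differ slightly from the paper) is: two augmented views $x_1, x_2$ of an input $x$ go through an online weight $W$ and predictor $W_p$ on one branch, and a target weight $W_a$ (either $W$ itself or its EMA) with a stop-gradient on the other branch,

$$ J(W, W_p) = \tfrac{1}{2}\,\mathbb{E}\big[\, \| W_p W x_1 - \mathrm{StopGrad}(W_a x_2) \|^2 \,\big], $$

with gradient-flow dynamics that include the weight decay $\eta$ and the relative predictor learning rate $\alpha_p$:

$$ \dot W_p = -\alpha_p \big( \nabla_{W_p} J + \eta W_p \big), \qquad \dot W = -\big( \nabla_{W} J + \eta W \big). $$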

Theorem 1 (Weight decay promotes balancing of the predictor and online networks.)

Theorem 2 (The stop-gradient signal is essential for success.)

3. How multiple factors affect learning dynamics

4. Optimization-free Predictor $W_p$
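As far as I understand this section (a hedged sketch, not the authors' official recipe; the class name, `rho`, `eps`, and the exact eigenvalue normalization are my assumptions), the predictor $W_p$ is not trained at all: a moving-average estimate $F$ of the correlation matrix of the online representations is eigendecomposed, and $W_p$ is set directly from the square-rooted, normalized eigenvalues.

```python
import torch

class DirectPredictor:
    """Sets W_p from an EMA estimate of the representation correlation matrix
    instead of learning it by gradient descent (sketch, not the official code)."""
    def __init__(self, dim, rho=0.3, eps=0.1):
        self.F = torch.zeros(dim, dim)   # running estimate of E[f f^T]
        self.W_p = torch.eye(dim)
        self.rho, self.eps = rho, eps

    @torch.no_grad()
    def refresh(self, f):                # f: (batch, dim) online representations
        self.F = (1 - self.rho) * self.F + self.rho * (f.T @ f) / f.shape[0]
        lam, U = torch.linalg.eigh(self.F)               # F = U diag(lam) U^T
        s = lam.clamp(min=0.0) / lam.max().clamp(min=1e-12)
        p = s.sqrt() + self.eps                          # boost small/zero eigenvalues
        self.W_p = U @ torch.diag(p) @ U.T               # symmetric predictor

    def __call__(self, f):
        return f @ self.W_p.T            # apply the fixed predictor to online features
```

This object would simply replace the learned `predictor` module in the training-step sketch above, with `refresh` called on each batch of online representations.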

It got boring, so stopping here...

pocca2048 commented 3 years ago

4