facebookresearch / dino

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
Apache License 2.0
6.06k stars, 885 forks

How does a zero KLD indicate a collapse? #235

Open YoojLee opened 1 year ago

YoojLee commented 1 year ago

Hi,

[Image: attached paragraph with sentences highlighted in yellow]

According to the sentences highlighted in yellow in the attached paragraph, a KL divergence (KLD) of zero indicates a constant output and hence a collapse. However, I thought a KLD of zero means that the two distributions (here, the teacher's output distribution and the student's) have become identical. I don't understand why two identical distributions imply a collapse, or what "a constant output" means exactly. If someone could give me a hint, that would be a huge help! Thanks!
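
To make the question concrete, here is a minimal sketch of the case I have in mind. The logits are made-up toy values and the helper names are hypothetical, not taken from the DINO code. The student matches the teacher on each input, so the KLD is zero for both inputs, yet the outputs still differ from one input to the other, which is why I would not have called them constant.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """D_KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy logits (assumed values) for two different inputs, a and b.
teacher_a = softmax([2.0, 0.5, -1.0])
student_a = softmax([2.0, 0.5, -1.0])   # student matches the teacher on input a
teacher_b = softmax([-1.0, 0.5, 2.0])
student_b = softmax([-1.0, 0.5, 2.0])   # student matches the teacher on input b

kl_a = kl_divergence(teacher_a, student_a)         # zero: identical distributions
kl_b = kl_divergence(teacher_b, student_b)         # zero: identical distributions
kl_mismatch = kl_divergence(teacher_a, student_b)  # positive: distributions differ

print(kl_a, kl_b, kl_mismatch)
print(teacher_a != teacher_b)  # outputs still depend on the input
```

In this toy setup the KLD is zero on both inputs without either network producing the same output for every input, so I am not sure how zero KLD alone implies a constant output.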