wvangansbeke / Unsupervised-Classification

SCAN: Learning to Classify Images without Labels, incl. SimCLR. [ECCV 2020]
https://arxiv.org/abs/2005.12320

Motivation of semantic clustering through scan-loss #96

Closed wetliu closed 2 years ago

wetliu commented 2 years ago

Thank you for your great work. According to your paper, the performance of nearest neighbors after the pretext task can be good, but there may not be a cluster structure (please correct me if I am wrong).

May I confirm that the purpose of the scan-loss is to, hopefully, pull together samples that potentially share the same labels, so that a cluster structure is formed?

And is the Hungarian matching algorithm actually measuring the quality of the cluster structure? Is that why the accuracy after the pretext task differs from the accuracy obtained with the Hungarian matching algorithm, even though we use the same embeddings (after the pretext task is finished and before semantic clustering begins)?

Thank you so much!

wvangansbeke commented 2 years ago

Hi,

Evaluating the predictions does not make sense without the Hungarian matching. You need to know the mapping from clusters to the ground-truth classes.
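As a minimal sketch of what this matching looks like (not the repo's exact evaluation code), one can build a cluster-to-class confusion matrix and solve the assignment with `scipy.optimize.linear_sum_assignment`; the function name `hungarian_accuracy` is just illustrative:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_accuracy(cluster_ids, targets, num_classes):
    """Map predicted cluster ids to ground-truth classes, then compute accuracy."""
    # Confusion matrix: rows are cluster ids, columns are ground-truth classes.
    cost = np.zeros((num_classes, num_classes), dtype=np.int64)
    for c, t in zip(cluster_ids, targets):
        cost[c, t] += 1
    # Hungarian matching maximizes total agreement (hence the negation).
    row_ind, col_ind = linear_sum_assignment(-cost)
    mapping = dict(zip(row_ind, col_ind))
    remapped = np.array([mapping[c] for c in cluster_ids])
    return (remapped == np.array(targets)).mean()
```

Without this remapping, a cluster id has no fixed correspondence to a class label, so a raw accuracy number would be meaningless.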

The scan-loss enforces neighboring samples to belong to the same cluster while also enforcing uniformity over the clusters. I refer you to the paper for more details.
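Roughly, and assuming the formulation from the paper (consistency between an anchor and its mined neighbors plus an entropy term over the mean cluster assignment; the entropy weight of 5.0 below is only an illustrative default), the loss looks like this sketch:

```python
import torch
import torch.nn.functional as F

def scan_loss(anchor_logits, neighbor_logits, entropy_weight=5.0):
    """Sketch of the SCAN objective: consistency between neighbors + uniformity."""
    anchor_probs = F.softmax(anchor_logits, dim=1)
    neighbor_probs = F.softmax(neighbor_logits, dim=1)
    # Consistency: push each anchor and its neighbor toward the same cluster.
    consistency = -torch.log((anchor_probs * neighbor_probs).sum(dim=1)).mean()
    # Entropy of the mean assignment: maximizing it spreads samples over clusters.
    mean_probs = anchor_probs.mean(dim=0)
    entropy = -(mean_probs * torch.log(mean_probs.clamp(min=1e-8))).sum()
    return consistency - entropy_weight * entropy
```

The first term is what pulls neighboring samples into the same cluster; the second prevents the degenerate solution where everything collapses into one cluster.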

wetliu commented 2 years ago

Thank you so much for your response. The relationship between KNN and the clusters is much clearer now. Thank you!