wvangansbeke / Unsupervised-Classification

SCAN: Learning to Classify Images without Labels, incl. SimCLR. [ECCV 2020]
https://arxiv.org/abs/2005.12320

About "Pretext + Kmeans' #126

fi591 opened this issue 2 years ago

fi591 commented 2 years ago

Hi, I see that simply 'Pretext + Kmeans' achieves 65.2 on CIFAR-10 on average. I downloaded your model and tried it, but it only reaches 33.3%. Can you tell me your settings or anything special you used? (I didn't see it in your code.)

spdj2271 commented 1 year ago

I have the same question. This is what I get after running simclr.py and then performing K-means on the learned representations on the CIFAR-10 dataset. The clustering performance is much lower than Pretext + K-means (ACC = 65.9) in Table 1 of the paper.

wvangansbeke commented 1 year ago

You're not applying K-means correctly if you don't get 65%. You need to normalize the features, fit K-means on the train set, and report the results on the validation set. You also need to average the results over multiple runs. The code for K-means clustering is mostly the same as what I provided in other repositories, for example the one for semantic segmentation.
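Roughly, the procedure looks like this (a minimal sketch of my reading of the steps above, not the repo's actual evaluation code; `train_feats`/`val_feats` are assumed to be the 512-d pretext features as NumPy arrays and `val_labels` the ground-truth classes, all names illustrative):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

def hungarian_acc(y_true, y_pred, n_clusters):
    """Match cluster ids to class labels with the Hungarian algorithm and return ACC."""
    # Contingency matrix: rows are predicted clusters, columns are true classes.
    w = np.zeros((n_clusters, n_clusters), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        w[p, t] += 1
    rows, cols = linear_sum_assignment(w.max() - w)  # maximize matched counts
    return w[rows, cols].sum() / len(y_true)

def kmeans_on_pretext(train_feats, val_feats, val_labels, n_clusters=10, runs=5):
    # L2-normalize the pretext features before clustering.
    tr, va = normalize(train_feats), normalize(val_feats)
    accs = []
    for seed in range(runs):
        # Fit on the train set, evaluate on the validation set.
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(tr)
        accs.append(hungarian_acc(val_labels, km.predict(va), n_clusters))
    # Average over multiple runs, as recommended above.
    return float(np.mean(accs)), float(np.std(accs))
```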

spdj2271 commented 1 year ago

Dear author, I obtain a result similar to that in Table 3 of the paper (ACC=65.9, NMI=59.8, ARI=50.9):

Evaluate with hungarian matching algorithm ... {'ACC': 0.6829, 'ARI': 0.4890204494614667, 'NMI': 0.5742197581324099, 'ACC Top-5': 0.9585,

However, the result I obtained comes from performing K-means on the outputs of the clustering network (i.e., $\Phi_\eta(X) \in \mathbb{R}^{10}$) before the clustering training starts, not on the output of the pretext network (i.e., $\Phi_\theta(X) \in \mathbb{R}^{512}$). This seems to be a bug.

I believe performing K-means on the features of the pretext network is important, since it often serves as a baseline, even if the clustering network you proposed achieves good performance.

wvangansbeke commented 1 year ago

You should indeed cluster the pretext features, not the class vectors. Yes, we use K-means as a baseline in our paper.
It's not clear to me what you believe is a 'bug' though. I was able to get the same numbers for K-means with this codebase and the provided models. Most issues are caused by not normalizing properly or not using the correct pretext features.

spdj2271 commented 1 year ago

I mean it would be clearer to report the results of K-means on $\Phi_\theta(X)$ rather than K-means on $\Phi_\eta(X)$ in the 'Pretext [7] + K-means' row of Table 1. But you seem to report the result of K-means on $\Phi_\eta(X)$, which is the 'bug' I referred to. This is easy to misunderstand, at least for me.

Maybe 'bug' was not the right word (I am not a native English speaker). I apologize for any confusion.

wvangansbeke commented 1 year ago

No, you are misunderstanding something. We cluster the features of $\Phi_\theta$, not $\Phi_\eta$. The latter does not make sense.
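To make the distinction concrete, here is a toy sketch with stand-in modules (not this repo's actual classes): K-means should run on the 512-d backbone features $\Phi_\theta(X)$, not on the 10-d cluster-head output $\Phi_\eta(X)$.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the two networks being discussed (illustrative only):
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512))  # plays the role of Φ_θ
cluster_head = nn.Linear(512, 10)                                    # plays the role of Φ_η

x = torch.randn(8, 3, 32, 32)      # dummy batch of CIFAR-sized images
phi_theta = backbone(x)            # shape (8, 512): these are the features to cluster
phi_eta = cluster_head(phi_theta)  # shape (8, 10): class vectors, not meant for K-means
```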

gihanjayatilaka commented 1 year ago

Edit: I was able to replicate @wvangansbeke's result. My code is not general enough to share (it is part of something convoluted).

A few issues you might be running into (see the sketch after this list):

  1. Make sure you extract the pretext features with the model in the eval() state.
  2. Compute the mean and std_dev on the train set (the full dataset) and normalize both the train and test datasets with these numbers.
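A short sketch of both points, assuming a PyTorch pretext model and standard torchvision loaders; it reads point 2 as per-channel image normalization with train-set statistics, and all names are illustrative, not this repo's API:

```python
import torch
import torchvision
import torchvision.transforms as T

# Point 2: per-channel statistics computed on the train split only,
# then reused to normalize both the train and test splits.
raw = torchvision.datasets.CIFAR10(root='./data', train=True, download=True,
                                   transform=T.ToTensor())
imgs = torch.stack([img for img, _ in raw])
mean = imgs.mean(dim=(0, 2, 3)).tolist()
std = imgs.std(dim=(0, 2, 3)).tolist()
norm = T.Compose([T.ToTensor(), T.Normalize(mean, std)])

train_set = torchvision.datasets.CIFAR10(root='./data', train=True, transform=norm)
test_set = torchvision.datasets.CIFAR10(root='./data', train=False, transform=norm)

# Point 1: extract features with the model in eval() mode and no gradients,
# so BatchNorm uses its running statistics and dropout is disabled.
@torch.no_grad()
def extract_features(model, loader, device='cuda'):
    model.eval()
    return torch.cat([model(x.to(device)).cpu() for x, _ in loader])
```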