Open fi591 opened 2 years ago
I have the same question. This is what I get after running simclr.py and then performing K-means on the learned representations on the CIFAR-10 dataset. The clustering performance is much lower than Pretext + K-means (ACC = 65.9) in Table 1 of the paper.
You're not applying K-means correctly if you don't get 65%. You need to normalize the features and report K-means results on the validation set after fitting on the train set. You also need to average the results over multiple runs. The code for K-means clustering is mostly the same as what I provided in other repositories, such as the one for semantic segmentation.
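A minimal sketch of that protocol, assuming the pretext features have already been extracted as NumPy arrays (the function and variable names here are illustrative, not from this codebase):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

def kmeans_predictions(train_feats, val_feats, n_clusters=10, n_runs=5):
    """Fit K-means on L2-normalized train features, then predict on the
    validation features, once per seed so results can be averaged."""
    train_feats = normalize(train_feats)  # L2-normalize each feature vector
    val_feats = normalize(val_feats)
    runs = []
    for seed in range(n_runs):
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        km.fit(train_feats)                  # fit on the train set
        runs.append(km.predict(val_feats))   # report on the validation set
    return runs
```

Each entry of `runs` is one set of cluster assignments; the clustering metrics are then computed per run and averaged.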
Dear author, I obtain a similar result to the one in Table 3 of the paper (ACC=65.9, NMI=59.8, ARI=50.9):
Evaluate with hungarian matching algorithm ... {'ACC': 0.6829, 'ARI': 0.4890204494614667, 'NMI': 0.5742197581324099, 'ACC Top-5': 0.9585,
However, the result I obtained comes from performing K-means on the outputs of the clustering network (i.e., $\Phi_\eta(X)\in \mathbb{R}^{10}$) before clustering training starts, not on the outputs of the pretext network (i.e., $\Phi_\theta\left(X\right)\in \mathbb{R}^{512}$). This seems to be a bug.
I believe performing K-means on the features of the pretext network is important, since it often serves as a baseline, even if the clustering network you propose achieves good performance.
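For reference, the 'hungarian matching' ACC in the log above is the clustering accuracy under the best one-to-one mapping between predicted clusters and ground-truth classes. It can be computed with `scipy.optimize.linear_sum_assignment`; a minimal sketch (the function name is mine, not from this codebase):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_acc(y_true, y_pred, n_classes):
    """Clustering accuracy: remap predicted cluster ids to ground-truth
    classes via the Hungarian algorithm, then measure agreement."""
    counts = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        counts[p, t] += 1  # confusion counts: cluster p vs. class t
    # linear_sum_assignment minimizes cost, so negate the counts to maximize.
    rows, cols = linear_sum_assignment(counts.max() - counts)
    mapping = dict(zip(rows, cols))
    remapped = np.array([mapping[p] for p in y_pred])
    return (remapped == np.asarray(y_true)).mean()
```

A clustering that is perfect up to a permutation of cluster labels scores ACC = 1.0 under this metric.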
You should indeed cluster the pretext features, not the class vectors. Yes, we use KMeans as a baseline in our paper.
It's not clear to me what you believe is a 'bug' though. I was able to get the same numbers for KMeans with this codebase and the provided models. Most issues are caused by not normalizing properly or not using the correct pretext features.
I mean it would be better, for clarity, to report the results of K-means on $\Phi_\theta\left(X\right)$ rather than K-means on $\Phi_\eta\left(X\right)$ in the row Pretext [7] + K-means
of Table 1. But you seem to report the result of K-means on $\Phi_\eta\left(X\right)$, which is the bug I mentioned. This is easy to misunderstand, at least for me.
Maybe the word 'bug' was not the best choice (I'm a non-native English speaker). I apologize for any inconvenience.
No, you are misunderstanding something. We cluster the features of Φθ and not Φη. The latter does not make sense.
Edit: I was able to replicate @wvangansbeke 's result. My code is not general enough to share (it is part of something convoluted).
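To make the $\Phi_\theta$ vs. $\Phi_\eta$ distinction concrete in code, here is a toy NumPy sketch of which tensor the K-means baseline should consume (the weight matrices are random stand-ins purely to show the shapes, not the actual networks):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the two networks (random weights, shapes only):
# Φ_θ: backbone mapping images to 512-d pretext features,
# Φ_η: cluster head mapping those features to 10 class scores.
W_theta = rng.normal(size=(3 * 32 * 32, 512))
W_eta = rng.normal(size=(512, 10))

x = rng.normal(size=(4, 3 * 32 * 32))  # a flattened batch of 4 CIFAR-10-sized images
phi_theta = x @ W_theta   # (4, 512): the pretext features — cluster THESE
phi_eta = phi_theta @ W_eta  # (4, 10): cluster-head output — NOT the baseline input
```

The 'Pretext + K-means' baseline runs K-means on `phi_theta`-style features; feeding it the 10-dimensional `phi_eta` outputs measures something else entirely.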
A few issues that you might be running into are: not L2-normalizing the features, not fitting K-means on the train set before reporting on the validation set, or not using the correct pretext features.
Hi, I see that simply 'Pretext + K-means' achieves 65.2 on CIFAR-10 on average. I downloaded your model and tried it, but it only achieves 33.3%. Can you tell me your settings, or anything special you used? (I didn't see it in your code.)