wvangansbeke / Unsupervised-Classification

SCAN: Learning to Classify Images without Labels, incl. SimCLR. [ECCV 2020]
https://arxiv.org/abs/2005.12320

The Implementation result of pretext task + kmeans #49

Closed forrestsz closed 3 years ago

forrestsz commented 3 years ago

Hi, thanks for your nice work; it has inspired me a lot. I want to reproduce the result of the pretext task + k-means on CIFAR-10 (65% ACC in the paper). First, I downloaded the checkpoint from here: https://drive.google.com/file/d/1Cl5oAcJKoNE5FSTZsBSAKLcyA5jXGgTT/view Then, I added some code to eval.py as follows:

        print('Fill Memory Bank')
        fill_memory_bank(dataloader, model, memory_bank)

        if not args.simclr_kmeans:
            print('Mine the nearest neighbors')
            for topk in [1, 5, 20]:  # Similar to Fig 2 in paper
                _, acc = memory_bank.mine_nearest_neighbors(topk)
                print('Accuracy of top-{} nearest neighbors on validation set is {:.2f}'.format(topk, 100 * acc))
        else:
            head = 0
            print(memory_bank.features.cpu().shape)
            # Cluster the memory-bank features with k-means, then evaluate
            # the cluster assignments against the targets via Hungarian matching
            kmeans = KMeans(n_clusters=config['num_classes'], random_state=0).fit(memory_bank.features.cpu())
            cluster_labels = torch.from_numpy(kmeans.labels_).cuda()
            predictions = [{'predictions': cluster_labels, 'probabilities': 1, 'targets': memory_bank.targets}]
            clustering_stats = hungarian_evaluate_me(head, predictions, dataset.classes,
                                                     compute_confusion_matrix=True)
            print(clustering_stats)

But I get the following result, which is far below the 65% reported in the paper:

{'ACC': 0.3647, 'ARI': 0.13848755246278868, 'NMI': 0.2627059928586838, 'hungarian_match': [(0, 2), (1, 1), (2, 8), (3, 3), (4, 5), (5, 9), (6, 0), (7, 4), (8, 6), (9, 7)]}

Maybe I made some mistake in the calculation; can you tell me where I went wrong? Many thanks for your time!
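For reference, here is a minimal sketch of the Hungarian-matching accuracy that an evaluation helper like `hungarian_evaluate_me` would compute (the helper above is the poster's own function; this standalone version, using only `numpy` and `scipy`, is an assumption about its core logic):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_accuracy(predictions, targets, num_classes):
    """Clustering accuracy: find the one-to-one mapping from cluster ids
    to class labels that maximizes agreement, then score the remapped
    predictions against the ground truth."""
    # Contingency matrix: cost[i, j] = how often cluster i co-occurs with class j
    cost = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, t in zip(predictions, targets):
        cost[p, t] += 1
    # linear_sum_assignment minimizes, so subtract from the max to maximize matches
    row, col = linear_sum_assignment(cost.max() - cost)
    match = dict(zip(row, col))
    remapped = np.array([match[p] for p in predictions])
    return (remapped == np.asarray(targets)).mean()
```

The `hungarian_match` list in the printed stats above corresponds to the `(row, col)` pairs found here.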

wvangansbeke commented 3 years ago

Hi @linqinghong,

Thank you for your interest. There are indeed a few issues.

Hope this helps.

forrestsz commented 3 years ago

Thanks for your reply, I will try your suggestions soon. Thanks!

wvangansbeke commented 3 years ago

OK. Please reach out if something goes wrong. Closing this issue for now.

cag472 commented 3 years ago

I am having this same issue.

Thanks for any advice

07Agarg commented 3 years ago

Hi @cag472,

I was also getting around 45% k-means clustering accuracy on the training features taken before the MLP head.

But it improved to around 65% ACC when I l2-normalized those features first. I would recommend trying l2-normalization.
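The suggestion above can be sketched as follows (variable names are illustrative; this assumes scikit-learn's `KMeans` and `normalize`, not any function from this repo):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

def kmeans_on_normalized(features, num_classes, seed=0):
    """Run k-means on l2-normalized feature vectors.

    Each row of `features` is scaled to unit norm before clustering, so
    Euclidean distances between rows reflect cosine similarity.
    """
    feats = normalize(features, norm='l2', axis=1)
    km = KMeans(n_clusters=num_classes, random_state=seed, n_init=10).fit(feats)
    return km.labels_
```

In the eval snippet above, this would mean normalizing `memory_bank.features.cpu()` before the `.fit(...)` call.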

Thanks.

TsungWeiTsai commented 3 years ago

Using spherical k-means provides slightly better results when applied to the training+testing set. Please refer to MiCE (ICLR 2021).
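For anyone unfamiliar with it, here is a minimal spherical k-means sketch (this is not MiCE itself; the farthest-point initialization is my own choice for determinism). Points and centroids live on the unit sphere, and assignment uses cosine similarity rather than Euclidean distance:

```python
import numpy as np

def spherical_kmeans(X, k, n_iters=50):
    """Spherical k-means: cluster unit-norm vectors by cosine similarity,
    re-normalizing each centroid after every update."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Farthest-point initialization: start from the first point, then
    # repeatedly add the point least similar to the current centroids
    centroids = [X[0]]
    for _ in range(k - 1):
        sims = np.stack([X @ c for c in centroids]).max(axis=0)
        centroids.append(X[sims.argmin()])
    centroids = np.stack(centroids)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iters):
        labels = (X @ centroids.T).argmax(axis=1)  # assign by cosine similarity
        for j in range(k):
            members = X[labels == j]
            if len(members):
                c = members.sum(axis=0)
                centroids[j] = c / np.linalg.norm(c)  # project back to the sphere
    return labels
```

Note that running standard k-means on l2-normalized features (as suggested above) is a close cousin of this, but spherical k-means also keeps the centroids on the sphere between iterations.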