wvangansbeke / Unsupervised-Classification

SCAN: Learning to Classify Images without Labels, incl. SimCLR. [ECCV 2020]
https://arxiv.org/abs/2005.12320

I want to train with my own dataset #64

Closed anewusername77 closed 3 years ago

anewusername77 commented 3 years ago

It's an image dataset without labels. Should I organize it like an ImageNet-style dataset, i.e. with images of different labels in different folders?

anewusername77 commented 3 years ago

But I don't have labels.

wvangansbeke commented 3 years ago

Hi @scarletteshu,

Thank you for your interest.

Yes. You need to write your own dataset (e.g. data/cifar.py). Please refer to the following issues: #8, #19, #34. They might be useful. Also, since you don't have labels available, you will have to remove the evaluation code.
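
As a rough illustration of what such a dataset could look like for a flat folder of unlabeled images, here is a minimal sketch. It is not code from this repository: the class name `UnlabeledImageFolder`, the `loader` argument and the `path` entry in `meta` are hypothetical, and the returned dict layout (`image`, `target`, `meta`) is an assumption about what the training code expects, so check it against data/cifar.py before relying on it.

```python
import os


class UnlabeledImageFolder:
    """Map-style dataset over a flat folder of images with no labels.

    Assumption: the training code consumes samples as dicts with
    'image', 'target' and 'meta' keys. Verify against data/cifar.py.
    """

    EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp')

    def __init__(self, root, transform=None, loader=None):
        self.root = root
        self.transform = transform
        # Hypothetical hook: lets a different image reader be swapped in.
        self.loader = loader if loader is not None else self._pil_loader
        # Sorted so that index -> file stays stable across runs, which
        # matters if nearest-neighbor indices are precomputed and reused.
        self.paths = sorted(
            os.path.join(root, name)
            for name in os.listdir(root)
            if name.lower().endswith(self.EXTENSIONS)
        )

    @staticmethod
    def _pil_loader(path):
        from PIL import Image  # imported lazily; requires Pillow
        with open(path, 'rb') as f:
            return Image.open(f).convert('RGB')

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        image = self.loader(self.paths[index])
        if self.transform is not None:
            image = self.transform(image)
        # No labels available: -1 is a placeholder, never a real class.
        return {
            'image': image,
            'target': -1,
            'meta': {'index': index, 'path': self.paths[index]},
        }
```

Because the placeholder target carries no information, anything that compares predictions against `target` (accuracy, confusion matrices) has to be disabled when this kind of dataset is used.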

anewusername77 commented 3 years ago

Thanks a lot! I'm new to this, so I'll ask again if I run into any more problems. Thanks again~

anewusername77 commented 3 years ago

Dear author, my new questions are as follows:

Looking forward to your response~ (Sorry for having so many questions.)

wvangansbeke commented 3 years ago

Hi @scarletteshu,

Yes, you will have to modify the code. If you don't have labels, you can't compute the accuracy. You can remove that part. The validation loss is used to select the best model. You can define your own validation set or take the final model.
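
To make the "define your own validation set" option concrete, here is a small sketch that holds out part of the unlabeled images and keeps the epoch with the lowest validation loss. The helper names `split_indices` and `BestModelTracker` are hypothetical and not part of this repository; the training loop itself is left out.

```python
import math
import random


def split_indices(n, val_fraction=0.1, seed=0):
    """Shuffle range(n) reproducibly and split it into (train, val) index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_val = max(1, int(n * val_fraction))
    return indices[n_val:], indices[:n_val]


class BestModelTracker:
    """Remembers which epoch had the lowest validation loss so far."""

    def __init__(self):
        self.best_loss = math.inf
        self.best_epoch = None

    def update(self, epoch, val_loss):
        """Return True when this epoch is the new best (save a checkpoint then)."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch
            return True
        return False
```

In a training loop, one would call `tracker.update(epoch, val_loss)` after each validation pass and save the model only when it returns `True`. No labels are needed for this, since the loss is computed from the predictions alone.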

anewusername77 commented 3 years ago

Thanks for your reply. When I trained on CIFAR-10, the losses looked like: consistency loss 8.5809e-01, entropy 2.3005e+00. But when I trained on my own dataset, the consistency loss was always close to the entropy, and the values in predictions['probabilities'] were close to each other (such as 0.1001, 0.1012, ...). What do you think the problem is? Compared to scan_imagenet_50.yml, I only changed the transforms (to our own) and the learning rate in the config file.

wvangansbeke commented 3 years ago

Hi @scarletteshu,

It's hard to say exactly what the problem is, especially since I don't know the dataset. However, lowering the weight in the loss will likely help.
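
For intuition on the reported symptom, here is a pure-Python sketch of a SCAN-style objective, assuming it has the form "consistency term minus a weighted entropy term" and that the weight mentioned above is the one on the entropy term. The function name `scan_loss_terms`, the parameter name `entropy_weight` and its default value are illustrative assumptions, not taken from the repository.

```python
import math


def scan_loss_terms(anchor_probs, neighbor_probs, entropy_weight=5.0):
    """Return (total, consistency, entropy) for lists of probability rows.

    anchor_probs[i] and neighbor_probs[i] are the softmax outputs for a
    sample and one of its nearest neighbors.
    """
    eps = 1e-8
    n = len(anchor_probs)
    k = len(anchor_probs[0])

    # Consistency: -log of the dot product between a sample's prediction
    # and its neighbor's prediction, averaged over the batch.
    consistency = -sum(
        math.log(max(sum(a * b for a, b in zip(p, q)), eps))
        for p, q in zip(anchor_probs, neighbor_probs)
    ) / n

    # Entropy of the mean prediction over the batch: high when the
    # clusters are used evenly.
    mean = [sum(p[j] for p in anchor_probs) / n for j in range(k)]
    entropy = -sum(m * math.log(max(m, eps)) for m in mean)

    # The entropy term is maximized, hence the minus sign.
    total = consistency - entropy_weight * entropy
    return total, consistency, entropy


# Collapsed case: every sample gets (almost) uniform probabilities.
uniform = [[0.1] * 10 for _ in range(4)]
_, cons_u, ent_u = scan_loss_terms(uniform, uniform)

# Healthy case: confident predictions spread evenly over the 10 clusters.
one_hot = [[1.0 if i == j else 0.0 for j in range(10)] for i in range(10)]
_, cons_c, ent_c = scan_loss_terms(one_hot, one_hot)
```

With uniform predictions over 10 clusters, both the consistency term and the entropy equal ln(10) ≈ 2.3026, which matches the "consistency loss close to entropy, probabilities around 0.1" behaviour described above. Under these assumptions, a large entropy weight makes the uniform solution attractive, because it already maximizes the entropy term without the network having to commit to any cluster, which is one reason lowering that weight can help.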

wvangansbeke commented 3 years ago

If there are still issues let me know. Closing this for now.