facebookresearch / deepcluster

Deep Clustering for Unsupervised Learning of Visual Features

Data Augmentation for Training #52

Closed tao0420 closed 4 years ago

tao0420 commented 4 years ago

Hi there,

Thanks a lot for the contribution; the code is amazingly well organized!

I noticed that during the clustering step, the data augmentation you use is very simple and relies on "centercrop". Is there a specific reason for that? Did you try other data augmentations?

mathildecaron31 commented 4 years ago

Hi, thank you for your interest in our work. We do indeed cluster the representations computed from the center crop of each image, and then use stronger data augmentation during training: https://github.com/facebookresearch/deepcluster/blob/9796a71abbfd14181a2b117d6244e60c2d94efbf/clustering.py#L142 In other words, we treat each augmented version of an image as belonging to the cluster of its center crop. Data augmentation during training is crucial for the method to work well, and I think stronger augmentation might further improve the quality of the features. We didn't experiment with data augmentation for the clustering step, though. Please re-open the issue if you have further questions.
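For readers who want to see the idea in isolation, here is a minimal, dependency-free sketch of "label from the center crop, train on augmented views". It is not the repository's code: images are plain nested lists, `toy_cluster` is a hypothetical stand-in for k-means on convnet features, and all function names are made up for illustration.

```python
import random


def center_crop(img, size):
    """Deterministic size x size window from the middle of img (a list of rows)."""
    h, w = len(img), len(img[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in img[top:top + size]]


def random_crop_flip(img, size, rng):
    """Stochastic training view: random crop position plus a coin-flip mirror."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - size)    # randint is inclusive on both ends
    left = rng.randint(0, w - size)
    crop = [row[left:left + size] for row in img[top:top + size]]
    if rng.random() < 0.5:
        crop = [row[::-1] for row in crop]  # horizontal flip
    return crop


def toy_cluster(crop):
    """Hypothetical stand-in for clustering features: dark (0) vs. bright (1)."""
    flat = [p for row in crop for p in row]
    return int(sum(flat) / len(flat) >= 0.5)


def assign_pseudo_labels(images, size, cluster_fn):
    """Clustering step: one pseudo-label per image, from its center crop only."""
    return [cluster_fn(center_crop(img, size)) for img in images]


def training_samples(images, labels, size, rng, views_per_image=2):
    """Training step: each augmented view inherits its image's pseudo-label."""
    for img, label in zip(images, labels):
        for _ in range(views_per_image):
            yield random_crop_flip(img, size, rng), label


rng = random.Random(0)
images = [
    [[0.1] * 6 for _ in range(6)],  # a uniformly dark 6x6 "image"
    [[0.9] * 6 for _ in range(6)],  # a uniformly bright 6x6 "image"
]
labels = assign_pseudo_labels(images, 4, toy_cluster)
samples = list(training_samples(images, labels, 4, rng))
```

The point of the split is that the clustering input is deterministic, so each image gets exactly one assignment per epoch, while the classifier sees varied views that all map to that assignment.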