Yunfan-Li / Contrastive-Clustering

Code for the paper "Contrastive Clustering" (AAAI 2021)
MIT License
289 stars 92 forks

About the dataset concatenation #33

Open yuanhaoguo opened 2 years ago

yuanhaoguo commented 2 years ago

Excellent work! We are grateful that the code has been released for study. I have a question about how the dataset is created:

dataset = data.ConcatDataset([train_dataset, test_dataset])

I guess that both the train and test sets are used here for training and for testing. I can understand that in unsupervised clustering the true labels are never seen, so all of the data can be used for training. But I am wondering whether this is standard practice in the field of "deep clustering", or have I misunderstood something? Thanks~

Yunfan-Li commented 2 years ago

Most deep clustering methods concatenate the training and test sets and use the combined data for both training and evaluation. Some works, including ADC ("Associative deep clustering: Training a classification network with no labels") and SCAN ("SCAN: Learning to Classify Images without Labels"), use the training set for training and the test set for evaluation. Personally, I feel that either setting can be adopted as long as you state clearly in the paper which one you used.
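
For readers comparing the two settings, here is a minimal sketch of how they differ in which samples are used for fitting and which for evaluation. It is not code from this repository: the helper `make_splits`, the protocol names `"concat"` and `"held_out"`, and the toy samples are all hypothetical, and plain Python lists stand in for the real datasets (in the repository the merge is done with `data.ConcatDataset`, as quoted in the question).

```python
from typing import List, Sequence, Tuple

# Hypothetical sample type: (image id, ground-truth label).
# In both settings the label is used only to score the clustering, never to fit it.
Sample = Tuple[str, int]


def make_splits(
    train: Sequence[Sample], test: Sequence[Sample], protocol: str
) -> Tuple[List[Sample], List[Sample]]:
    """Return (fit_set, eval_set) for the chosen protocol (names are assumptions)."""
    if protocol == "concat":
        # Setting 1: merge train and test, then fit and evaluate on the same merged data.
        merged = list(train) + list(test)
        return merged, merged
    if protocol == "held_out":
        # Setting 2: fit on the training set, evaluate on the unseen test set.
        return list(train), list(test)
    raise ValueError(f"unknown protocol: {protocol!r}")


# Toy data for illustration only.
train_dataset = [("train_0", 3), ("train_1", 7), ("train_2", 3)]
test_dataset = [("test_0", 7), ("test_1", 1)]

fit_set, eval_set = make_splits(train_dataset, test_dataset, "concat")
print(len(fit_set), len(eval_set))  # 5 5

fit_set, eval_set = make_splits(train_dataset, test_dataset, "held_out")
print(len(fit_set), len(eval_set))  # 3 2
```

Whichever protocol is chosen, reporting it alongside the results makes numbers comparable across papers, which is the point made in the reply above.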