xu-ji / IIC

Invariant Information Clustering for Unsupervised Image Classification and Segmentation
MIT License
865 stars, 207 forks

mapping/assignment dataloader #71

Closed sc-56 closed 4 years ago

sc-56 commented 4 years ago

Sorry for asking,

I'm wondering what the "mapping/assignment dataloader" contains in the evaluation phase. Do the "mapping dataloader" and the "assignment dataloader" hold the same content?

Thanks for your support.

xu-ji commented 4 years ago

For fully unsupervised clustering, the training and test sets are allowed to be the same, so mapping_assignment_dataloader == mapping_test_dataloader.

For semi-supervised overclustering, which makes material use of labels to find the cluster-to-class mapping, the test set has to be unseen (as in supervised evaluations), so mapping_assignment_dataloader != mapping_test_dataloader (mapping_assignment_dataloader counts as training data but mapping_test_dataloader does not).
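To make the two roles concrete, here is a minimal sketch, not the repository's code. It assumes a simple majority-vote, many-to-one mapping (the repository's matching may differ), and the function names and toy predictions are hypothetical. The mapping is fitted on the assignment data and then scored on the test data; in the fully unsupervised case the same data is used for both steps.

```python
from collections import Counter, defaultdict

def fit_cluster_to_class(cluster_ids, labels):
    """Map each cluster to its most frequent label (majority vote, many-to-one)."""
    votes = defaultdict(Counter)
    for c, y in zip(cluster_ids, labels):
        votes[c][y] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in votes.items()}

def accuracy(cluster_ids, labels, mapping):
    """Fraction of samples whose mapped cluster equals the true label."""
    correct = sum(mapping.get(c) == y for c, y in zip(cluster_ids, labels))
    return correct / len(labels)

# Hypothetical cluster predictions and labels from the "assignment" data.
assign_clusters = [0, 0, 1, 2, 2, 3]
assign_labels = [0, 0, 0, 1, 1, 1]

# Hypothetical predictions and labels from a separate, unseen "test" set.
test_clusters = [0, 1, 2, 3, 3]
test_labels = [0, 0, 1, 1, 0]

mapping = fit_cluster_to_class(assign_clusters, assign_labels)

# Semi-supervised overclustering: fit on assignment data, score on unseen test data.
semi_sup_acc = accuracy(test_clusters, test_labels, mapping)

# Fully unsupervised: assignment and test data are the same set.
unsup_acc = accuracy(assign_clusters, assign_labels, mapping)
```

Because the mapping is fitted with labels, scoring it on the data it was fitted on is only acceptable when the clustering itself never saw labels, which is the fully unsupervised setting.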

xu-ji commented 4 years ago

See also this thread on data loading, where I added a simpler function.

sc-56 commented 4 years ago

Thanks for the support, and sorry for the further question.

In the 'unsupervised learning' setting, how does the content of the [training dataset] differ from that of the [mapping_assignment_dataset]? Is the transformation (tf2 vs. tf3) the only difference?

xu-ji commented 4 years ago

tf2 and tf3 are different, and sometimes the underlying data is different.

All the data partitions for unsupervised learning are listed in this function. In most cases [training dataset] and [mapping_assignment_dataset] have the same underlying data. The exception is STL10, where mapping_assignment_dataset excludes the unlabelled portion of the training data (because mapping_assignment_dataset is used to find the cluster-to-class mapping).
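As an illustration of the STL10 exception, here is a minimal sketch with toy records instead of images. All names are hypothetical and this is not the repository's partition function; it only shows why unlabelled samples can be used for clustering but cannot be used to find the mapping.

```python
# Toy records standing in for images; label=None marks an unlabelled sample.
# All names here are hypothetical, not the repository's API.
labelled_train = [{"id": i, "label": i % 2} for i in range(3)]
unlabelled = [{"id": i, "label": None} for i in range(3, 7)]
test = [{"id": i, "label": i % 2} for i in range(7, 9)]

def training_data(parts):
    """Clustering training ignores labels, so every partition can be used."""
    return [s for part in parts for s in part]

def mapping_assignment_data(parts):
    """Only labelled samples can vote on the cluster-to-class mapping."""
    return [s for part in parts for s in part if s["label"] is not None]

all_parts = [labelled_train, unlabelled, test]
train_set = training_data(all_parts)
assign_set = mapping_assignment_data(all_parts)
```

For a dataset with no unlabelled portion, the two functions return the same samples, which matches the "same underlying data" case described above.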

sc-56 commented 4 years ago

Thank you so much for your comprehensive explanation.

Really appreciated, and thanks a lot.