About data loading and training sets

evablanco commented 2 years ago

Hi!

I am testing the Office31 dataset with various models, such as DANN. From reading the code, it seems that the validation and test sets are the same: both are the target set. As I understand it, model selection should use a held-out subset of the source set, because in unsupervised domain adaptation the target labels are not supposed to be known. However, when I make this change in the code, the accuracy plummets. What is the reason for using the target set for model validation? Shouldn't the source set be used for this purpose?

Thanks

thucbx99 commented 2 years ago

Indeed, in general machine learning tasks, the label information of the target set should not be known. In the field of domain adaptation, the choice of validation set remains an open problem. Currently, most methods directly use the test set for validation, so we follow their implementation.
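
For reference, the source-only alternative raised in the question could be sketched roughly as below. This is a minimal illustration, not the library's code: `split_source`, `select_checkpoint`, the record keys, the sample count, and every accuracy value are hypothetical and made up for the example.

```python
import random


def split_source(n_samples, val_fraction=0.1, seed=0):
    """Shuffle source indices and hold out a fraction for validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_val = max(1, int(n_samples * val_fraction))
    # First n_val shuffled indices become the validation split.
    return indices[n_val:], indices[:n_val]


def select_checkpoint(history, key):
    """Return the record with the highest value under `key`."""
    return max(history, key=lambda record: record[key])


# Hypothetical source domain with 1000 labeled samples.
train_idx, val_idx = split_source(1000, val_fraction=0.1, seed=0)

# Made-up per-epoch accuracies, for illustration only.
history = [
    {"epoch": 0, "source_val_acc": 0.90, "target_test_acc": 0.60},
    {"epoch": 1, "source_val_acc": 0.95, "target_test_acc": 0.70},
    {"epoch": 2, "source_val_acc": 0.93, "target_test_acc": 0.74},
]

# Label-free selection: choose the epoch using source validation only.
chosen = select_checkpoint(history, "source_val_acc")
# Selection that peeks at target labels (validation set == test set).
oracle = select_checkpoint(history, "target_test_acc")

print("source-selected epoch:", chosen["epoch"], chosen["target_test_acc"])
print("target-selected epoch:", oracle["epoch"], oracle["target_test_acc"])
```

In this toy history the two criteria pick different epochs. Under domain shift, source validation accuracy does not have to track target accuracy, which may be part of why the reported number drops when the validation set is switched to the source domain.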

JunguangJiang commented 2 years ago

Hope https://github.com/thuml/Transfer-Learning-Library/issues/15, https://github.com/thuml/Transfer-Learning-Library/issues/28 and https://github.com/thuml/Transfer-Learning-Library/issues/47 could answer your question.

thuml / Transfer-Learning-Library

About data loading and training sets #141