Questions about the split of office-31 dataset

tim-learn / SHOT

code released for our ICML 2020 paper "Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation"

MIT License

Questions about the split of office-31 dataset #7

Closed: Hongbin98 closed this issue 4 years ago

Hongbin98 commented 4 years ago

In the original Office-31 paper, the authors used 8 labeled samples per category for webcam/dslr and 20 for amazon to train the model on the source and target domains, and others used 3 labeled samples per category to test the model and report accuracy. However, your code uses a "0.9/0.1 train/test" split of the Office-31 dataset. Do your settings differ from those of the original paper? I would like to cite your paper in our work, so I am looking forward to your reply.

tim-learn commented 4 years ago

@Anthem0w0 As is common in recent UDA papers, we report accuracy on the entire target domain. The 0.9/0.1 split is applied to the source domain only, for source model training: the 0.9 part is used for training, and the 0.1 part (validation) is used to select the best source model.
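
For readers who want to see the protocol described above in code, here is a minimal sketch in plain Python. It is not the repository's actual implementation: the helper names (`split_source`, `accuracy`), the toy data, and the two stand-in "models" are all hypothetical and only illustrate the idea of splitting the source domain 0.9/0.1, selecting the source model on the 0.1 validation part, and reporting accuracy on the whole target domain.

```python
import random


def split_source(samples, train_frac=0.9, seed=0):
    """Shuffle the source samples and split them into train/validation parts."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def accuracy(predict, samples):
    """Fraction of (input, label) pairs that `predict` classifies correctly."""
    correct = sum(1 for x, y in samples if predict(x) == y)
    return correct / len(samples)


# Toy stand-ins for the source and target domains (31 classes, as in Office-31).
source = [(i, i % 31) for i in range(310)]
target = [(i, i % 31) for i in range(100)]

# The 0.9/0.1 split is applied to the source domain only.
src_train, src_val = split_source(source)

# Hypothetical candidate source models (e.g. checkpoints from different epochs).
candidates = {
    "always_zero": lambda x: 0,
    "modulo": lambda x: x % 31,
}

# Select the best source model using the 0.1 validation part of the source domain.
best_name = max(candidates, key=lambda name: accuracy(candidates[name], src_val))

# Report accuracy on the entire target domain, with no target split.
target_acc = accuracy(candidates[best_name], target)
```

In this sketch the target set is never split: every target sample contributes to the reported accuracy, and the validation part of the source domain is used only for model selection.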