thuml / Deep-Embedded-Validation

Code release for Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation (ICML 2019)
MIT License
62 stars 10 forks

About DEV Method #3

Open TomSheng21 opened 3 years ago

TomSheng21 commented 3 years ago

Thanks for your great work. I use the models in Transfer-Learning-Library together with the DEV method to search for the best learning rate. When I select the hyperparameters, the validation set split off from the source domain does not participate in training; after selecting the hyperparameters, do I need to retrain on the complete source domain to get the final model? Also, about the experiments in the DEV paper: for a dataset such as Office-31, which has six tasks, is the final selected learning rate and loss trade-off one group or six groups (one group of hyperparameters for each task, or one for the whole dataset)? Looking forward to your answer, thanks a lot.

TomSheng21 commented 3 years ago

Also, what percentage of the training set is split off as the validation set?

youkaichao commented 3 years ago

Hi, sorry for the late reply (I forgot to turn on notifications for this repo =_=|| ).

  1. after selecting the hyperparameters, do I need to train with the complete source domain to get the final model?

Yes, you are encouraged to do so; it is the standard procedure. However, people in deep learning today sometimes skip this step, because training deep models takes a long time, which makes a final run on the complete source domain tedious. You are free to choose either protocol, as long as you state clearly which one you used.
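In code, the two-stage protocol looks roughly like this. Everything below is a toy placeholder sketch, not the actual Transfer-Learning-Library API: `train` and `dev_score` stand in for a real training run and for the DEV risk estimate.

```python
def train(data, lr):
    # stand-in for a real training run; returns a "model"
    return {"lr": lr, "data": data}

def dev_score(model, val_data):
    # stand-in for the DEV risk estimate (lower is better);
    # toy scoring that pretends lr = 0.01 is optimal
    return abs(model["lr"] - 0.01)

def select_and_retrain(source_train, source_val, full_source, lrs):
    # stage 1: pick the lr with the lowest DEV risk on the held-out split
    best_lr = min(lrs, key=lambda lr: dev_score(train(source_train, lr), source_val))
    # stage 2: retrain once on the complete source domain
    return train(full_source, best_lr)

final = select_and_retrain("src-80%", "src-20%", "src-100%", [0.001, 0.01, 0.1])
```

If you skip stage 2 (the shortcut mentioned above), you would simply return the stage-1 model trained on `source_train` instead.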

  2. the final selected learning rate and loss trade-off are one group or six groups?

Six groups, one for each task. The significance of DEV is that it proposes a standard protocol for tuning hyper-parameters in domain adaptation, so you can tune hyper-parameters for each task separately. Previously, people would tune one group of hyper-parameters on a single task and reuse it, but that somehow breaks the unsupervised setting.
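For reference, the quantity driving this per-task selection is the DEV estimate of target risk: importance-weighted source-validation losses plus a control variate to reduce variance. A minimal numpy sketch, with function and variable names of my own choosing (see the ICML 2019 paper for the derivation):

```python
import numpy as np

def dev_risk(losses, weights):
    """DEV target-risk estimate from per-sample source-validation losses
    and density-ratio weights w(x) ~ p_target(x) / p_source(x).
    The coefficient eta = -Cov(w*l, w) / Var(w) is a control variate."""
    losses = np.asarray(losses, dtype=float)
    w = np.asarray(weights, dtype=float)
    wl = w * losses
    cov = np.cov(wl, w)              # 2x2 sample covariance matrix
    eta = -cov[0, 1] / cov[1, 1]     # -Cov(w*l, w) / Var(w)
    return wl.mean() + eta * w.mean() - eta
```

In practice the weights come from a learned domain discriminator; here they are just inputs. The hyper-parameter group with the lowest `dev_risk` on a task's validation split is the one you keep for that task.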

  3. What percentage of the data in the training set will be split as validation set?

Typically we use 20% for validation.