Closed YiDongOuYang closed 3 years ago
Can you share which commit of the repo you are using?
Also, you should let --n_hparams and --n_trials take their default values to replicate our results. Please re-open if you still get different numbers when doing so.
Thanks for your quick reply! I am using the newest version, b953488d4dcfcc76427f07958b133b87d24a48e5, and I will use the default values of --n_hparams and --n_trials.
Another question: why do we need to run [0,1], [0,2], [0,3], [1,2], [1,3], [2,3]? If I understand correctly, [0,1] means the target domain is the mixture of domain 0 and domain 1.
Having two test domains is necessary for the leave-one-domain-out model selection criterion.
Thank you! Is there any way to reduce the number of experiments when I only want the accuracy under the training-domain validation selection criterion? Launching sweep.py automatically produces the results for all three criteria.
You could replace these lines with
all_test_envs = [[d] for d in range(datasets.num_environments(dataset))]
to avoid launching jobs with two test environments, which are only necessary for leave-one-domain-out validation.
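To see how much this change saves, here is a minimal, self-contained sketch that enumerates the test-environment combinations for a four-domain dataset such as PACS. The helper names `single_test_envs` and `single_and_pair_test_envs` are hypothetical and not part of DomainBed. The sketch assumes the default sweep covers every single test environment plus every pair, as the list [0,1], ..., [2,3] quoted above suggests.

```python
from itertools import combinations


def single_test_envs(n_envs):
    """One test environment per job group: [[0], [1], ...]."""
    return [[d] for d in range(n_envs)]


def single_and_pair_test_envs(n_envs):
    """Singles plus all unordered pairs (assumed default sweep layout)."""
    pairs = [list(p) for p in combinations(range(n_envs), 2)]
    return single_test_envs(n_envs) + pairs


n_envs = 4  # PACS has four domains: A, C, P, S
singles = single_test_envs(n_envs)
full = single_and_pair_test_envs(n_envs)

print(singles)                   # the reduced list of test-env settings
print(len(singles), len(full))   # number of settings: reduced vs. assumed default
```

Under these assumptions the reduced list has 4 test-environment settings instead of 10, so the sweep launches fewer than half as many jobs per algorithm and hyper-parameter draw.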
Launching sweeps and selecting models are two completely separate processes. Selecting models is done based on a finished sweep. So first you need to decide what your sweep will contain (algorithms, datasets, environments...). Once you launch and finish your sweep, you will be able to select models from that finished sweep according to different strategies (in-domain, leave-one-out, oracle).
Again, sweeps do not have anything to do with model selection strategies. They just run random combinations of hyper-parameters and log all the results to files. Then you can run different model selection strategies on those files.
If you want to run only some subset of jobs (one particular test env), you will have to hack sweep.py
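The separation between sweeping and selecting can be illustrated with a small sketch. It is not DomainBed's actual selection code: the record fields (`train_val_acc`, `test_val_acc`, `test_acc`) and all numbers are made up for illustration, and each record stands in for one finished run of a sweep. The point is only that the same finished records can be ranked under different criteria without rerunning anything.

```python
# Hypothetical, simplified records: one per finished run in a sweep.
# All values are made-up illustrations, not real DomainBed results.
records = [
    {"hparams_seed": 0, "train_val_acc": 0.93, "test_val_acc": 0.71, "test_acc": 0.70},
    {"hparams_seed": 1, "train_val_acc": 0.95, "test_val_acc": 0.66, "test_acc": 0.67},
    {"hparams_seed": 2, "train_val_acc": 0.90, "test_val_acc": 0.78, "test_acc": 0.77},
]


def select(records, criterion):
    """Pick the run with the highest value of the given selection criterion."""
    return max(records, key=lambda r: r[criterion])


# Same finished sweep, two different selection strategies.
by_training_val = select(records, "train_val_acc")  # training-domain validation
by_oracle = select(records, "test_val_acc")         # test-domain validation (oracle)

print(by_training_val["hparams_seed"], by_training_val["test_acc"])
print(by_oracle["hparams_seed"], by_oracle["test_acc"])
```

In this toy data the two criteria pick different runs, which is why the reported accuracy depends on the selection method even though the sweep itself is identical.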
@YiDongOuYang Hi, have you found out the reason for lower sweep accuracy? I encountered the same problem. I used default "n_hparams" and "n_trials".
I followed the command line given in README.md and randomly chose an algorithm to get some results. However, my results are more than 20% below what was reported. Could you please help me figure out the problem?
I have already rechecked my PACS dataset, which is the same as "https://drive.google.com/uc?id=0B6x7gtvErXgfbF9CSk53UkRxVzg" in download.py. All commands and their output can be found below.
python -m domainbed.scripts.sweep launch \
    --data_dir=/home/guoweiyu/yidong/data/PACS \
    --output_dir=/home/guoweiyu/yidong/dg/sweep \
    --command_launcher multi_gpu \
    --algorithms DANN \
    --datasets PACS \
    --n_hparams 1 \
    --n_trials 1
Environment:
    Python: 3.7.3
    PyTorch: 1.3.1
    Torchvision: 0.4.2
    CUDA: 10.1.243
    CUDNN: 7603
    NumPy: 1.16.2
    PIL: 5.4.1
Args:
    algorithm: DANN
    checkpoint_freq: None
    data_dir: /home/guoweiyu/yidong/data/PACS/
    dataset: PACS
    holdout_fraction: 0.2
    hparams: None
    hparams_seed: 0
    output_dir: train_output
    save_model_every_checkpoint: False
    seed: 0
    skip_model_save: False
    steps: None
    test_envs: [0]
    trial_seed: 0
HParams:
    batch_size: 32
    beta1: 0.5
    class_balanced: False
    d_steps_per_g_step: 1
    data_augmentation: True
    grad_penalty: 0.0
    lambda: 1.0
    lr: 5e-05
    lr_d: 5e-05
    lr_g: 5e-05
    mlp_depth: 3
    mlp_dropout: 0.0
    mlp_width: 256
    resnet18: False
    resnet_dropout: 0.0
    weight_decay: 0.0
    weight_decay_d: 0.0
    weight_decay_g: 0.0
(base) guoweiyu@bj08:~/yidong/dg/DomainBed-master$ python -m domainbed.scripts.collect_results --input_dir=/home/guoweiyu/yidong/dg/sweep
Total records: 170
-------- Dataset: PACS, model selection method: training-domain validation set
Algorithm  A             C             P             S             Avg
DANN       69.7 +/- 0.0  68.1 +/- 0.0  96.9 +/- 0.0  64.8 +/- 0.0  74.9

-------- Averages, model selection method: training-domain validation set
Algorithm  PACS          Avg
DANN       74.9 +/- 0.0  74.9

-------- Dataset: PACS, model selection method: leave-one-domain-out cross-validation
Algorithm  A             C             P             S             Avg
DANN       40.3 +/- 0.0  68.1 +/- 0.0  94.1 +/- 0.0  64.8 +/- 0.0  66.8

-------- Averages, model selection method: leave-one-domain-out cross-validation
Algorithm  PACS          Avg
DANN       66.8 +/- 0.0  66.8

-------- Dataset: PACS, model selection method: test-domain validation set (oracle)
Algorithm  A             C             P             S             Avg
DANN       21.1 +/- 0.0  18.1 +/- 0.0  29.0 +/- 0.0  18.6 +/- 0.0  21.7

-------- Averages, model selection method: test-domain validation set (oracle)
Algorithm  PACS          Avg
DANN       21.7 +/- 0.0  21.7