deeplearning-wisc / hypo


The result on the PACS dataset is not reproduced. #4

Open JungHunOh opened 5 months ago

JungHunOh commented 5 months ago

Hello,

I'm currently working on replicating the results from the PACS dataset using your code.

However, I've encountered a discrepancy: the accuracies I obtain fall significantly below those reported in the paper.

For instance, with 'cartoon' as the target domain, I achieved an accuracy of 79.78% on PACS, whereas the paper reports 82.3%.

I executed the script/train_hypo_dg.sh file.

I haven't checked whether similar issues occur on the other datasets.

Here is the training log: https://wandb.ai/junghunoh/hypo/reports/PACS-target-domain-cartoon---Vmlldzo3NDY5NTkw

Here are the training arguments:

{'augment': True, 'batch_size': 64, 'bottleneck': True, 'cosine': True, 'epochs': 50,
 'feat_dim': 512, 'gpu': 0, 'head': 'mlp', 'id_loc': 'datasets/PACS', 'in_dataset': 'PACS',
 'learning_rate': 0.0005, 'loss': 'hypo', 'lr_decay_epochs': '100,150,180', 'lr_decay_rate': 0.1,
 'mode': 'online', 'model': 'resnet50', 'momentum': 0.9, 'normalize': False, 'prefetch': 4,
 'print_freq': 10, 'proto_m': 0.95, 'save_epoch': 100, 'seed': 4, 'start_epoch': 0,
 'target_domain': 'cartoon', 'temp': 0.1, 'trial': '0', 'use_domain': False, 'w': 2.0,
 'warm': False, 'weight_decay': 0.0001}

I would appreciate it if you could look into this issue.

Thanks.

YixuanLi commented 4 months ago

Thanks for flagging this issue, which we take seriously. The first authors Haoyue and Yifei will follow up.

JungHunOh commented 4 months ago

Thanks for your answer. Please note that the training log is no longer visible because I deleted it by mistake on the wandb server. Instead, I am sharing the log file directly here: output.log

HaoyueBaiZJU commented 4 months ago

Hello, thank you for your interest in our work. The script/train_hypo_dg.sh file includes default hyperparameters; however, we conducted hyperparameter tuning following common practice in DomainBed. The optimal hyperparameters vary by domain: for example, the best 'lr' for the cartoon domain is 0.0005, with a 'batch_size' of 32 and 'w' of 4.0. Please refer to the appendix of our paper for the detailed hyperparameter ranges. We have also updated our script with the per-domain hyperparameters.
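
For readers trying to reproduce the cartoon setting, a minimal sketch of what those per-domain values could look like as a training command is below. This is an assumption, not the authors' confirmed invocation: the flag names are inferred from the argument dictionary posted earlier in the thread, and the entry-point name train_hypo_dg.py is hypothetical. The updated script in the repository is the authoritative reference.

    # Hypothetical sketch: flag names inferred from the printed argument dictionary,
    # entry-point name assumed; see the updated script/train_hypo_dg.sh for the real command.
    python train_hypo_dg.py \
        --in_dataset PACS \
        --id_loc datasets/PACS \
        --model resnet50 \
        --loss hypo \
        --target_domain cartoon \
        --learning_rate 0.0005 \
        --batch_size 32 \
        --w 4.0 \
        --epochs 50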