AAAI-2025 closed this issue 11 months ago
Hello, thanks for your interest in reproducing our work. To answer your questions:
Is this issue caused by the difference between our experimental environment
I don't think a small difference in the Python environment would lead to such a large performance gap.
the setting of the original paper?
Yes. To compare different state-of-the-art TTA methods as fairly as possible, for the online results benchmarked in the paper we ran each method with different combinations of lr and n_train_steps and picked the best one, according to the metrics we used, out of all results. For more details, please see the scripts we uploaded in the exps folder, such as this one.
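The selection procedure described here can be sketched roughly as follows. This is only an illustration, not the code in the exps scripts: `select_best_config` and the `evaluate` callable are hypothetical names, `evaluate(lr, n_train_steps)` is assumed to run one full adaptation pass and return an error rate (lower is better), and the grid values in the usage note are placeholders rather than the grid used for the paper.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Tuple

Config = Tuple[float, int]  # (lr, n_train_steps)


def select_best_config(
    evaluate: Callable[[float, int], float],
    lrs: Iterable[float],
    n_train_steps_options: Iterable[int],
) -> Tuple[Config, float, Dict[Config, float]]:
    """Evaluate every (lr, n_train_steps) pair and keep the lowest error rate.

    `evaluate` is a hypothetical stand-in for launching one TTA run;
    it is assumed to return an error rate where lower is better.
    """
    results: Dict[Config, float] = {}
    for lr, n_steps in product(lrs, n_train_steps_options):
        results[(lr, n_steps)] = evaluate(lr, n_steps)

    # Pick the configuration with the smallest error rate.
    best = min(results, key=results.get)
    return best, results[best], results
```

A call might look like `select_best_config(run_once, lrs=[1e-4, 1e-3], n_train_steps_options=[1, 2])`, where `run_once` is whatever wrapper launches a single run. A single run at one fixed setting, such as lr of 1e-4 and n_train_steps of 1, is just one cell of this grid, so it may not match a number that was selected as the best over the whole grid.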
Or there are other reasons?
Yes, please carefully set the values of hyperparameters such as batch_size, and check them against the scripts we provided in the exps folder.
Hope it helps!
Hello, thanks for your timely reply. I will try your suggestions soon and let you know whether they work. Thanks a million.
Hi, @jiangqinting. Do you have any further feedback on this problem? If you need more help, feel free to contact us.
Dear authors, I have a few questions. We ran the code with the following settings:
"--model_adaptation_method": "note"
"--model_selection_method": "last_iterate"
"--model_selection_method": "cifar10"
"--model_name": "resnet26"
"--episodic": "false"
"--data_names": (
    "cifar10_c_deterministic-snow-5;"
    "cifar10_c_deterministic-brightness-5;"
    "cifar10_c_deterministic-fog-5;"
    "cifar10_c_deterministic-frost-5;"
    "cifar10_c_deterministic-contrast-5;"
    "cifar10_c_deterministic-motion_blur-5;"
    "cifar10_c_deterministic-glass_blur-5;"
    "cifar10_c_deterministic-zoom_blur-5;"
    "cifar10_c_deterministic-gaussian_noise-5;"
    "cifar10_c_deterministic-shot_noise-5;"
    "cifar10_c_deterministic-jpeg_compression-5;"
    "cifar10_c_deterministic-impulse_noise-5;"
    "cifar10_c_deterministic-pixelate-5;"
    "cifar10_c_deterministic-elastic_transform-5;"
    "cifar10_c_deterministic-defocus_blur-5",
)
"--batch_size": 100
"--lr": 1e-4
"--n_train_steps": 1
"--inter_domain": "HomogeneousNoMixture"
The error rate of NOTE is 41%, which is quite different from the "NOTE-online" result in Table 2 of the paper (24.0 ± 0.1). Is this issue caused by the difference between our experimental environment and the setting of the original paper? Or are there other reasons?