Hi, the example you give can happen, but only very rarely. With 20 different random splits in every round of hyperparameter search, even if something like that happens once, I don't think it will make a great difference.
Thanks for your quick reply! In my experiments, 'env2_in_acc' is sometimes 4% higher (or lower) than 'env2_out_acc'. That may be acceptable, but I think it would be better to make the two split datasets follow the same distribution.
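One way to keep the two splits on the same distribution is a stratified split. The snippet below is only a minimal sketch, not the repository's actual split code: `stratified_split` and the `groups` list are hypothetical names, and it assumes each sample carries a group label (for example, a label/attribute combination) and that the "out" split is the held-out 10%.

```python
import random
from collections import defaultdict

def stratified_split(groups, holdout_fraction, seed=0):
    """Return (in_idx, out_idx) so that every group keeps roughly the
    same share in both splits. Hypothetical helper for illustration."""
    rng = random.Random(seed)

    # Collect the sample indices belonging to each group.
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)

    in_idx, out_idx = [], []
    for group in sorted(by_group):
        members = by_group[group]
        rng.shuffle(members)
        # Hold out the same fraction of every group.
        n_out = round(len(members) * holdout_fraction)
        out_idx.extend(members[:n_out])
        in_idx.extend(members[n_out:])
    return sorted(in_idx), sorted(out_idx)

# Toy data (made up): an imbalanced environment with two groups.
groups = ["majority"] * 900 + ["minority"] * 100
in_idx, out_idx = stratified_split(groups, holdout_fraction=0.1)
```

With a purely random split, the number of minority samples in the small "out" split varies from seed to seed; here it is fixed by construction.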
Thank you again!
Thank you for letting us know! Please also note that the small sample size of the out split is another reason the gap between the in-split and out-split accuracy varies.
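To get a rough feel for how much variation sample size alone can cause, the binomial standard error of an accuracy estimate can be computed as below. This is a back-of-the-envelope sketch, assuming independent samples and the same true accuracy on both splits; the function names and the numbers in the usage lines are hypothetical, not the actual split sizes.

```python
import math

def accuracy_std_error(acc, n):
    """Binomial standard error of an accuracy measured on n samples."""
    return math.sqrt(acc * (1.0 - acc) / n)

def gap_std_error(acc, n_in, n_out):
    """Standard error of (in_acc - out_acc), assuming the two splits are
    independent and share the same true accuracy."""
    se_in = accuracy_std_error(acc, n_in)
    se_out = accuracy_std_error(acc, n_out)
    return math.sqrt(se_in ** 2 + se_out ** 2)

# Hypothetical numbers for illustration: a 9:1 split of 1000 samples
# with a true accuracy of 0.9.
se_out = accuracy_std_error(0.9, 100)
se_gap = gap_std_error(0.9, n_in=900, n_out=100)
```

The smaller the out split, the larger the standard error, so a gap of a few percent between the two accuracies can arise without any difference in the underlying distributions.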
Excuse me. I have some questions about model selection in CelebA_Blond.
In your paper, CelebA uses test-domain validation, which means we choose the model that gets the best 'env2_out_acc'. The distribution of the test environment looks like this:
In the experiment, holdout_fraction is set to 0.1, but the test environment is split 9:1 at random. I think this may cause the distributions of the two split datasets to be inconsistent. For example:
I'm not sure whether this makes a difference or simply doesn't matter.
Looking forward to your reply, thanks.