Which dataset should we use for the unseen-domains setting?

yafuly / MAGE

Apache License 2.0


Which dataset should we use for the unseen-domains setting? #1

Closed by ryuryukke 1 year ago

ryuryukke commented 1 year ago

Hi, I'm trying to evaluate a detector in the "unseen domains" setting. I found two test files: test_ood.csv and test.csv. Which one should we use for this?

yafuly commented 1 year ago

Hi,

Thanks for your interest in our work.

'test.csv' is the random split used for in-distribution testing, while 'test_ood.csv' is the out-of-distribution test set (new domains or new models). Therefore, to evaluate the robustness of your detector on unseen domains, you should use 'test_ood.csv'.
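
For illustration, here is a minimal sketch of how a detector could be scored on either file. It is not code from the MAGE repository. The column names `text` and `label`, the integer label encoding, and the `evaluate_split` helper are all assumptions made for this example, so adjust them to match the actual CSV headers.

```python
import csv
from typing import Callable


def evaluate_split(
    csv_path: str,
    detector: Callable[[str], int],
    text_col: str = "text",    # assumed column name, check the CSV header
    label_col: str = "label",  # assumed column name, check the CSV header
) -> float:
    """Return the accuracy of `detector` on one CSV split.

    `detector` is any callable that maps a text to an integer label
    using the same encoding as the label column (hypothetical interface).
    """
    correct = 0
    total = 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            total += 1
            if detector(row[text_col]) == int(row[label_col]):
                correct += 1
    if total == 0:
        raise ValueError(f"no rows found in {csv_path}")
    return correct / total


# Example usage (paths are placeholders for wherever the files live):
#   in_dist = evaluate_split("test.csv", my_detector)
#   unseen  = evaluate_split("test_ood.csv", my_detector)
```

Running the same function on both files makes the gap between in-distribution and out-of-distribution accuracy easy to see; the 'test_ood.csv' number is the one to report for unseen domains.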

ryuryukke commented 1 year ago

Thank you for your help. Totally understood that :)