divelab / GOOD

GOOD: A Graph Out-of-Distribution Benchmark [NeurIPS 2022 Datasets and Benchmarks]
https://good.readthedocs.io/
GNU General Public License v3.0

Hi, about environmental partitioning of the dataset #28

Closed bruno686 closed 5 months ago

bruno686 commented 5 months ago

Hi, in the paper, the WebKB Concept dataset is shown to have three domains, with environments split 3/1/1, but the code shows something different (see the screenshot below). As I understand it, the training set, IID_eval, and IID_test use env_id 0, 1, 2; OOD_val uses env_id 3; and OOD_test uses env_id 4. What's going on here? And what is the difference between a domain and an environment? Thank you! ![17951712146260 pic](https://github.com/divelab/GOOD/assets/52364706/ca7429b9-58dc-4413-8169-48099134f27f)

CM-BF commented 5 months ago

Hi,

"-1" just indicates that the environment/domain information is not available for use in the validation and test sets. The data are split as you expect: "IID_eval, IID_test are 0,1,2 env_id, OOD_val is 3 env_id and OOD_test is 4 env_id".

Best, Shurui

bruno686 commented 5 months ago

Hi, you wrote: «"-1" just indicates the environment/domain information is not available to use in validation and test sets.» But from the code, shouldn't IID_eval and IID_test be AVAILABLE? I am a bit confused.

CM-BF commented 5 months ago

The IID_eval and IID_test samples are available, but their environment information is not, i.e., we only allow the use of environment information in training. In real-world testing scenarios, we cannot identify which environment partition a given sample belongs to.
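
To make the distinction concrete, here is a minimal sketch of the idea described above. It is not the GOOD codebase: the function name `split_and_mask`, the split names, and the "hold out every 4th and 5th in-distribution sample" rule are all hypothetical and chosen only for illustration. The sketch assumes five underlying environments (0 to 4), assigns samples to splits by their true environment, and then exposes the environment id only for training samples, replacing it with -1 everywhere else.

```python
UNKNOWN_ENV = -1  # placeholder meaning "environment label withheld"


def split_and_mask(true_env_ids):
    """Return one (split_name, exposed_env_id) pair per sample.

    Assumed layout (illustrative only): envs 0-2 are in-distribution,
    env 3 is the OOD validation set, env 4 is the OOD test set.
    """
    result = []
    in_dist_seen = 0  # counter used by the arbitrary hold-out rule below
    for env in true_env_ids:
        if env == 3:
            split = "ood_val"
        elif env == 4:
            split = "ood_test"
        else:
            # Hypothetical hold-out rule: within envs 0-2, every 4th sample
            # goes to ID validation and every 5th to ID test.
            slot = in_dist_seen % 5
            in_dist_seen += 1
            split = {3: "id_val", 4: "id_test"}.get(slot, "train")
        # Only training samples keep their environment label.
        exposed = env if split == "train" else UNKNOWN_ENV
        result.append((split, exposed))
    return result


samples = [0, 1, 2, 0, 1, 2, 3, 4]
for env, (split, exposed) in zip(samples, split_and_mask(samples)):
    print(f"true env {env}: split={split}, exposed env_id={exposed}")
```

In this sketch the held-out ID samples still come from environments 0 to 2, but a model evaluated on them only ever sees -1 as the environment id, which mirrors the "samples available, environment labels not" behaviour described in the reply.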

bruno686 commented 5 months ago

Thank you for your reply, I understand now. As for my other question, what is the difference between a domain and an environment?

CM-BF commented 5 months ago

You are welcome. In many domain generalization cases, the terms domain and environment are interchangeable. However, in invariant and causal learning, environments refer to the distributions that result from specific interventions. These interventions are more complex than the typical domains in domain generalization. Therefore, IMO, the concept of environment covers the concept of domain, but not the reverse.

Best, Shurui