Weixin-Liang / MetaShift

MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022)
MIT License

How to select subsets of indoor and outdoor in subpopulation experiment #7

Closed by litingfeng 2 years ago

litingfeng commented 2 years ago

Hi,

Thanks for contributing this great project!

I have a question about how you selected the subsets that belong to indoor and outdoor. While we can find the corresponding subset names specified by attributes, such as "dog(white)", in attributes-candidate-subsets.pkl, it looks like you specified the indoor and outdoor subsets manually in this file.

I was wondering how you generated train_set_scheme and test_set_scheme. What if we want to select cat & dog subsets with other contexts in GENERAL_CONTEXT_ONTOLOGY, e.g., cat(bedroom)? I also noticed there is a file obj2attribute.json. Could you please provide some instructions on how to use it?

Thanks, Ting

Weixin-Liang commented 2 years ago

Hi, thank you for your interest in our work.

As mentioned in the paper, the subsets are selected based on the node community detection results on the meta-graph.

Section 4.1: Evaluating domain generalization. "To make our evaluation settings more challenging, we merge similar subsets by running Louvain community detection algorithm (Blondel et al., 2008) on each meta-graph. Node color in Figure 2 indicates the community detection result."

In the code, the print_communities function provides the node community detection results. In practice, the community detection results tend to capture fine-grained communities. For example, although both cat(computer) and cat(sink) are indoor, they are assigned to two different communities as indicated by their node color. Therefore, for contexts that are more abstract (e.g., indoor or outdoor), you may need to manually group multiple communities together.
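
The manual grouping step described above could look roughly like the sketch below. It is only an illustration, not the repository's code: the format of the community list, the COMMUNITY_TO_CONTEXT mapping, the group_communities helper, and every subset name other than cat(computer) and cat(sink) are assumptions made for the example. The actual output of print_communities may be shaped differently.

```python
from collections import defaultdict

# Hypothetical community detection output: one set of subset names per
# fine-grained community. Only cat(computer) and cat(sink) come from the
# discussion above; the other names are made up for illustration.
communities = [
    {"cat(computer)", "cat(keyboard)"},
    {"cat(sink)", "cat(faucet)"},
    {"cat(grass)", "cat(fence)"},
]

# Manual assignment of each community index to an abstract context.
# This mapping is written by hand after inspecting the communities.
COMMUNITY_TO_CONTEXT = {0: "indoor", 1: "indoor", 2: "outdoor"}


def group_communities(communities, community_to_context):
    """Merge fine-grained communities into hand-labelled abstract contexts."""
    grouped = defaultdict(set)
    for idx, members in enumerate(communities):
        context = community_to_context.get(idx)
        if context is None:
            continue  # community was not assigned to any abstract context
        grouped[context] |= set(members)
    # Sort the names so the result is deterministic.
    return {context: sorted(names) for context, names in grouped.items()}


grouped = group_communities(communities, COMMUNITY_TO_CONTEXT)
for context, names in grouped.items():
    print(context, names)
```

Here cat(computer) and cat(sink) land in different communities but are merged under the same hand-assigned "indoor" label. A different context, such as bedroom, would presumably need its own hand-written mapping after checking which communities contain the relevant subsets.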