pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Re-Organize explanation examples, more effective evaluation of explanations #8536

Open Sutongtong233 opened 9 months ago

Sutongtong233 commented 9 months ago

🚀 The feature, motivation and pitch

Explanations for graph data are not as intuitive as for images; therefore, proper evaluation is very important. The current explanation examples are not well-organized. For example:

  1. gnn_explainer_ba_shapes.py: for this synthetic dataset, visualization of the explanation, as shown in the GNNExplainer paper, matters more than the ROC AUC metric, since we know the ground truth: the motif should be selected.
  2. gnn_explainer.py, captum_explainer.py, and graphmask_explainer.py (all using the Cora dataset): visualization without context is of little use for Cora, as we lack knowledge of the real-world significance of each node. A more effective approach would be to employ fidelity metrics to compare all of these explanation methods (see the sketch after this list).
  3. Rather than Cora, the MUTAG dataset used in GNNExplainer is more appropriate for evaluation, since we know the physical meaning of each node. Currently, MUTAG in PyG does not contain enough information; it is possible that certain details have been omitted during processing.
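To make point 2 concrete, here is a minimal sketch (not the repository's current example code) of how the Cora examples could report fidelity metrics through PyG's Explainer interface; the trained model, data object, and node index are assumptions for illustration:

```python
from torch_geometric.explain import Explainer, GNNExplainer
from torch_geometric.explain.metric import fidelity

# Assumed to exist: a trained GNN `model` and a `data` object (e.g., Cora).
explainer = Explainer(
    model=model,
    algorithm=GNNExplainer(epochs=200),
    explanation_type='model',
    node_mask_type='attributes',
    edge_mask_type='object',
    model_config=dict(
        mode='multiclass_classification',
        task_level='node',
        return_type='log_probs',  # assumes the model returns log-softmax
    ),
)

# Explain the prediction for a single (illustrative) node:
explanation = explainer(data.x, data.edge_index, index=10)

# Fidelity+/- measure how the prediction changes when the explanatory
# subgraph is removed/kept, so explainers can be compared numerically:
pos_fidelity, neg_fidelity = fidelity(explainer, explanation)
print(f'Fidelity+: {pos_fidelity:.4f}, Fidelity-: {neg_fidelity:.4f}')
```

Swapping GNNExplainer for CaptumExplainer or GraphMaskExplainer in the same harness would put all three methods on a common footing.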

I want to contribute to PyG as follows:

  1. Re-organize the explanation examples. For different kinds of graph datasets (real-world/synthetic), we should select proper evaluation methods (visualization/metrics).
  2. Add data processing code for raw MUTAG dataset used in GNNExplainer.
  3. Include all synthetic data referenced in GNNExplainer. Currently, grid motifs and tree-based datasets are not incorporated (see the sketch after this list for how they would plug into the existing generator API).
  4. SubgraphX has not been included in PyG, even though it has been on the TODO list for a long time. If feasible, I would like to undertake this implementation.
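For point 3, the existing BA-shapes pipeline already goes through ExplainerDataset, so the missing synthetic variants could likely be added as new graph/motif generators. A minimal sketch of the current API (grid and tree generators are the proposed additions, not existing classes):

```python
from torch_geometric.datasets import ExplainerDataset
from torch_geometric.datasets.graph_generator import BAGraph
from torch_geometric.datasets.motif_generator import HouseMotif

# BA-shapes as in the GNNExplainer paper: a Barabasi-Albert base graph
# with 80 house motifs attached at random nodes:
dataset = ExplainerDataset(
    graph_generator=BAGraph(num_nodes=300, num_edges=5),
    motif_generator=HouseMotif(),
    num_motifs=80,
)
data = dataset[0]
print(data.node_mask.sum())  # ground-truth motif nodes, usable for evaluation
```

A grid motif or a tree-based base graph would slot into the same motif_generator/graph_generator arguments.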

Alternatives

No response

Additional context

No response

rusty1s commented 9 months ago

+1 to all your points. We definitely need to provide metric support in some of these examples, and better visualizations are a welcome addition as well.

Can you clarify point 3, though? What's the difference between the two datasets?

Sutongtong233 commented 9 months ago

For MUTAG in PyG:

[screenshot: the loaded MUTAG dataset object]

there is only ONE graph, with "edge index" and "edge type" information. According to the documentation, the data processing comes from Modeling Relational Data with Graph Convolutional Networks, so it seems that MUTAG here is a relational graph. For the MUTAG dataset used in GNNExplainer, running the code in the original GNNExplainer repo:

[screenshot: output of the GNNExplainer data-loading code]

there is a list of 4,336 graphs, which is used for the graph classification task in GNNExplainer. Each graph carries a graph label (mutagen or non-mutagen), edge types (valence types), and node types (chemical atom types).
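For reference, a minimal sketch of loading the relational MUTAG described above, which presumably comes from the Entities dataset (the root path is illustrative):

```python
from torch_geometric.datasets import Entities

# The RGCN-style MUTAG from "Modeling Relational Data with Graph
# Convolutional Networks": a single relational graph.
dataset = Entities(root='data/Entities', name='MUTAG')
print(len(dataset))  # 1
print(dataset[0])    # Data(edge_index=..., edge_type=..., ...)
```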

rusty1s commented 9 months ago

You probably want to take a look at TUDataset(name="MUTAG").

Sutongtong233 commented 9 months ago

Thank you for your answer; this resolves the issue. The dataset used in GNNExplainer is called Mutagenicity, not MUTAG. They are two different datasets according to https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets:

[screenshot: dataset statistics table from the TU graph kernel datasets page]

Both of them can be loaded through TUDataset. I want to create an example on this dataset, since an explanation on this real-world dataset is more intuitive than on Cora and more persuasive than on synthetic datasets.
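A minimal loading sketch for both (the root path is illustrative):

```python
from torch_geometric.datasets import TUDataset

mutag = TUDataset(root='data/TUDataset', name='MUTAG')
mutagenicity = TUDataset(root='data/TUDataset', name='Mutagenicity')

print(len(mutag))         # 188 small molecular graphs
print(len(mutagenicity))  # 4337 graphs with mutagen/non-mutagen labels
```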

rusty1s commented 9 months ago

Yes, this sounds good :)