pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch

[Roadmap] Heterogeneous Graphs Explainability Support #9112

Open Mellandd opened 3 months ago

Mellandd commented 3 months ago

🚀 The feature, motivation and pitch

Explainability is a key capability for GNNs, and PyG already implements it. However, of all the features introduced so far, only a few have been adapted to heterogeneous graphs.

Algorithms: Of all the explanation algorithms implemented, only the Captum-based one (CaptumExplainer) is compatible with heterogeneous graphs. It would be interesting to adapt the graph-specific algorithms such as GNNExplainer or PGExplainer, as well as others such as AttentionExplainer.
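For reference, a minimal sketch of how CaptumExplainer can already be applied to a heterogeneous model today (assuming a node-level multiclass classifier, e.g. one converted with to_hetero(); `model`, `data`, and `node_idx` are placeholders):

```python
from torch_geometric.explain import Explainer, CaptumExplainer

explainer = Explainer(
    model=model,  # placeholder: a heterogeneous GNN, e.g. built via to_hetero()
    algorithm=CaptumExplainer('IntegratedGradients'),
    explanation_type='model',
    node_mask_type='attributes',
    edge_mask_type='object',
    model_config=dict(
        mode='multiclass_classification',
        task_level='node',
        return_type='log_probs',
    ),
)

# Passing dictionaries of features/edge indices yields a HeteroExplanation
# with masks stored per node type and per edge type:
explanation = explainer(data.x_dict, data.edge_index_dict, index=node_idx)
```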

Moreover, the set of algorithms available in PyG could be extended with new algorithms that have been published over the years (for example, see this survey), although this point is not specific to heterogeneous graphs. Going forward, new algorithms could be implemented for homogeneous and heterogeneous graphs simultaneously, so that this gap does not reappear.

Features: Some features available for explanations of homogeneous GNNs are missing for heterogeneous GNNs. For example, the visualize_graph method of Explanation is not available on HeteroExplanation. Right now this can be worked around by calling get_explanation_subgraph and generating the plot by hand with NetworkX (as sketched below), but it would be nice to have this done automatically.
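A minimal sketch of that manual workaround (not an official PyG API; `explanation` is assumed to be a HeteroExplanation as produced above):

```python
import matplotlib.pyplot as plt
import networkx as nx

# HeteroData containing only the nodes/edges kept by the explanation masks:
subgraph = explanation.get_explanation_subgraph()

G = nx.DiGraph()
for node_type in subgraph.node_types:
    for i in range(subgraph[node_type].num_nodes):
        G.add_node((node_type, i), node_type=node_type)

for src_type, rel, dst_type in subgraph.edge_types:
    edge_index = subgraph[(src_type, rel, dst_type)].edge_index
    for src, dst in edge_index.t().tolist():
        G.add_edge((src_type, src), (dst_type, dst), relation=rel)

nx.draw(G, node_size=50, arrows=True)
plt.show()
```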

Metrics: Currently, the available metrics such as Fidelity or Faithfulness only support homogeneous graphs, but they could be adapted for heterogeneous graphs. To continue the work of #5628, we could also think about implementing new metrics for all kinds of graphs, such as sparsity or stability (e.g., see https://arxiv.org/pdf/2012.15445.pdf).
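As an illustration only (not an existing PyG metric), a sparsity-style score for a HeteroExplanation could simply measure the fraction of near-zero mask entries across all node and edge types:

```python
def hetero_mask_sparsity(explanation, threshold: float = 1e-7) -> float:
    # Fraction of (near-)zero entries over all node/edge masks of a
    # HeteroExplanation; higher means a sparser (more compact) explanation.
    total, zeros = 0, 0
    for store in explanation.node_stores + explanation.edge_stores:
        for key in ('node_mask', 'edge_mask'):
            mask = store.get(key, None)
            if mask is not None:
                total += mask.numel()
                zeros += int((mask.abs() <= threshold).sum())
    return zeros / max(total, 1)
```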

Alternatives

No response

Additional context

No response

rusty1s commented 3 months ago

@wsad1 FYI

rachitk commented 3 months ago

As a heads up, #8512 seeks to make using Captum possible for some specific heterogeneous convolution layers for which the current implementation doesn't quite work (HANConv and HGTConv). I'm planning to update the PR to the current main branch soon. We might need a similar approach for the other explainers (special-cased implementations, due to the differing ways HANConv and HGTConv construct their edge indices when propagating).

I've also been toying around with implementing heterogeneous versions of GNNExplainer and PGExplainer based on DGL's heterogeneous implementations of these, though I haven't been fully successful yet (and I think there are a few things we can improve in their implementation).

Mellandd commented 3 months ago

Hello @rusty1s, do you think I could help with this? Which task do you think would make a good first issue?

rusty1s commented 3 months ago

I think the most important feature is to support GNNExplainer for heterogeneous graphs. Hopefully it shouldn't be hard to do since we just need to make sure that edge_mask and node_mask are created for every edge/node type.
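A rough sketch of that idea, following PyG's usual x_dict / edge_index_dict conventions (illustrative only, not the actual implementation):

```python
import torch
from torch.nn import Parameter

def init_hetero_masks(x_dict, edge_index_dict, std: float = 0.1):
    # One learnable node mask per node type and one learnable edge mask per
    # edge type, mirroring how the homogeneous GNNExplainer initializes its masks.
    node_masks = {
        node_type: Parameter(torch.randn(x.size(0), 1) * std)
        for node_type, x in x_dict.items()
    }
    edge_masks = {
        edge_type: Parameter(torch.randn(edge_index.size(1)) * std)
        for edge_type, edge_index in edge_index_dict.items()
    }
    return node_masks, edge_masks
```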