divelab / DIG

A library for graph deep learning research
https://diveintographs.readthedocs.io/
GNU General Public License v3.0
1.84k stars 280 forks source link

Shapley value calculation mismatch with the SubgraphX paper #56

Closed ShichangZh closed 2 years ago

ShichangZh commented 2 years ago

Hello there,

May I ask a question about the Shapley value calculation, especially the following line of code set_exclude_mask = np.ones(num_nodes)?

https://github.com/divelab/DIG/blob/1c3dd70d5ae9887c61d7c6b37f31c6433cfe0c07/dig/xgraph/method/shapley.py#L122-L125

From the implementation above, seems like the mask is initialized as including all the nodes in the graph. Then all the selected nodes in the node_exclude_subset are included, and other nodes in the local_region are not. However, doesn't this includes all the nodes outside the local region as well?

Based on this description from the paper,

Screen Shot 2021-11-03 at 9 40 20 PM

I feel like the line 122 above should be changed to set_exclude_mask = np.zeros(num_nodes) ? Or maybe I am missing something here? I appreciate your help.

Oceanusity commented 2 years ago

Thank you for your issue.

When we want to calculate the Shapley value, we constrain the players are in the local L-hop nodes. For the nodes outside the L-hop, we don't change the them. That's to say, we keep them.

If you want to dig more mathematically details about this operation, you can refer to the paper L&C Shapley.

ShichangZh commented 2 years ago

Thank you for your quick response! Sorry, I am still a bit confused. From the Algorithm 2 in the SubgraphX paper, seems like the nodes outside of the L-hop are set to zero features?

Screen Shot 2021-11-04 at 9 12 26 AM

I think the L&C Shapley paper is doing the same thing. The Definition 1 in the paper indicates that neighbors that are L-hops away are excluded from the game (zerod-out)?

Oceanusity commented 2 years ago

It seems like the notation here should be Set nodes from $ P' \ (S \union {G_i}) $ with zeros features in the second loop line and change the V \S_i with P' \S_i in the third loop line.

For the L&C shapley, you can check its code for more details. But I think with the L-hops away nodes unchanged, it will have the same marginal contribution as that in the original shapley value when the coalition is the same. Therefore, the approximation error is smaller than zero them out.

alirezadizaji commented 2 years ago

Hi @Oceanusity , I don't understand How the modified versions mean the nodes outside player set should contribute, It is more like we couldn't judge about their contribution while in the code by setting np.ones means they are contributing. and also this is not exactly what paper said and ought to be np.zeros.

Oceanusity commented 2 years ago

Hello, for the implementation of subgraphX, we take use of the l-shapley and apply it on the graphs. In order to better minimize the difference between the l-shapley and the original shapley, we should keep the nodes outside the k-hop neighbors to be selected when calculating the marginal contributions.

For the paper, in the algorithm 2, there is a symbol conflict. Please change the Set nodes from V \ (Si ∪ {Gi}) with zero features to Set nodes from P' \ (Si ∪ {Gi}) with zero features, and Set nodes from V \ Si with zero features and feed to the GNNs f (·) to obtain f (Si) to Set nodes from P' \ Si with zero features and feed to the GNNs f (·) to obtain f (Si).

Sorry for the symbol problem, and please take this implementation as the original algorithm of the SubgraphX.

Oceanusity commented 2 years ago

Comparing the l-Shapley value and the original Shapley value, if the input and output are conditionally independent, then they are the same. Therefore, they are the same when explaining graph neural networks on node classification tasks. It's OK to set other nodes unselected or selected in this case. However, when applying graph classification, this conditional independent is unsatisfied. In this case, we have to minimize the difference between the l-Shapley value and the original Shapley value. Therefore, we follow the l-Shapley to set the nodes outside the k-hop selected.

alirezadizaji commented 2 years ago

I see thanks