jianhao2016 / GPRGNN


Application to Graph classification #7

Closed mdanb closed 3 years ago

mdanb commented 3 years ago

How well do you think your model would apply to graph classification?

elichienxD commented 3 years ago

Hi @mdanb,

Very good question! We haven't tried applying our method to graph classification, but conceptually you can add a graph pooling layer on top for graph classification. However, I want to point out that the community has so far only defined homophily measures based on "node labels", not graph labels. Thus it is not clear what benefit GPR-GNN offers for graph classification problems beyond mitigating over-smoothing. Furthermore, I have not seen people apply APPNP to graph classification problems. So as a first step you may want to try applying APPNP to graph classification. If successful, then applying GPR-GNN should ideally give an even better result.
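To make the "propagation plus graph pooling" idea concrete, here is a minimal NumPy sketch. The function names (`appnp_propagate`, `graph_embedding`) and all hyperparameters are my own illustrative choices, not code from the GPR-GNN repository or the APPNP paper:

```python
import numpy as np

def appnp_propagate(A, H, alpha=0.1, K=10):
    """APPNP-style propagation: Z <- (1 - alpha) * A_hat @ Z + alpha * H."""
    # Symmetrically normalize the adjacency with self-loops:
    # A_hat = D^{-1/2} (A + I) D^{-1/2}
    A = A + np.eye(A.shape[0])
    d = A.sum(axis=1)
    A_hat = A / np.sqrt(np.outer(d, d))
    Z = H.copy()
    for _ in range(K):
        Z = (1 - alpha) * (A_hat @ Z) + alpha * H
    return Z

def graph_embedding(A, H):
    # Mean pooling: collapse node representations into a single
    # graph-level vector that a downstream classifier could consume.
    return appnp_propagate(A, H).mean(axis=0)
```

Here `H` stands in for the node features after the MLP step; in practice a learned pooling or sum pooling (as in GIN) may work better than plain mean pooling.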

Let me know if you have any other questions. If not, I'll close the issue in a few days.

Thanks, Eli

mdanb commented 3 years ago

@elichienxD Thanks for the reply. I see. So how should I expect (theoretically) APPNP to perform compared to a graph-agnostic baseline (i.e., an MLP) in the heterophily setting? It sounds like, based on what you're saying, APPNP should perform better, but GPR-GNN should do even better? In other words, what do you mean by "if successful" for APPNP? What counts as "successful"?

elichienxD commented 3 years ago

Hi @mdanb,

Basically, the "successful" I refer to is better performance than standard GNNs for graph classification, for example GCN plus some graph pooling function, or GIN. It could be that APPNP (or models of a similar type) doesn't work well in practice on graph classification when directly applied. I do not have a concrete answer on whether the result would be good, or on how one should modify APPNP for graph classification.

Regarding your first question, one should expect APPNP to perform no better (or even worse) than an MLP in the heterophily setting. We showed in our paper that APPNP can perform much worse than an MLP in synthetic experiments on cSBM (Figure 2) and has no clear advantage on real-world datasets (Table 1). Note that the trend is not very consistent on real-world datasets, as real situations are much more complicated.

Note that one of the main results in our paper is that APPNP corresponds to a low-pass graph filter. Hence, it cannot work well in theory for heterophily cases, as the ground-truth signal (the labels) is of high frequency (nodes with different labels are more likely to link). In contrast, an MLP can be viewed as a method with no graph filtering. Thus, in the heterophily case, applying a low-pass graph filter actually filters out the "useful" graph signal, which is harmful. That is why one should expect APPNP to perform worse than an MLP in theory. As a last remark, our GPR-GNN can model high-pass graph filters and thus has the ability to learn better than both APPNP and MLP, although many more factors affect the final result (such as optimization error, generalization error, etc.).
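To illustrate the filtering argument, here is a small NumPy sketch of the polynomial spectral response of GPR-style propagation. The function name, the specific gamma choices, and the parameter values are my own assumptions for illustration, not the paper's code:

```python
import numpy as np

# Spectral view: Z = sum_k gamma_k * A_hat^k * H acts on Laplacian
# eigenvalues lambda in [0, 2] as the polynomial filter
#   g(lambda) = sum_k gamma_k * (1 - lambda)^k.
def gpr_response(gammas, lam):
    return sum(g * (1.0 - lam) ** k for k, g in enumerate(gammas))

K, alpha = 10, 0.1
lam = np.linspace(0.0, 2.0, 201)  # low frequencies (homophily) to high (heterophily)

# APPNP fixes gamma_k = alpha * (1 - alpha)^k: a low-pass filter.
low_pass = gpr_response([alpha * (1 - alpha) ** k for k in range(K)], lam)

# GPR-GNN learns the gammas; alternating signs yield a high-pass filter.
high_pass = gpr_response([alpha * (alpha - 1) ** k for k in range(K)], lam)

print(low_pass[0] > low_pass[-1])    # low-pass: strongest response at lambda = 0
print(high_pass[-1] > high_pass[0])  # high-pass: strongest response at lambda = 2
```

The point is simply that the fixed APPNP weights emphasize low frequencies, while sign-flipping weights (which GPR-GNN is free to learn) emphasize the high frequencies that carry label information under heterophily.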

Hope these answer your questions.

Eli

mdanb commented 3 years ago

@elichienxD thanks a lot for the detailed answer, helps a lot!