snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License
1.89k stars 397 forks

Is 'to_symmetric' on Arxiv dataset appropriate? #244

Closed koooooooook closed 2 years ago

koooooooook commented 2 years ago

In this code (gnn.py), you use 'to_symmetric' to make the adjacency matrix of the Arxiv dataset undirected. I have two questions about this.

First, the Arxiv dataset is a directed graph built from the citation relationships between papers, and it also includes each paper's publication year. If we make the adjacency matrix undirected in order to use GraphSAGE (or other GNN models), doesn't that change the citation relationships?
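
To make the first question concrete, here is a minimal pure-Python sketch of what symmetrizing a directed edge list does. The helper names `symmetrize` and `in_neighbors` and the three-paper toy graph are hypothetical, made up for illustration; this is not the code in gnn.py, and it assumes messages flow from the source to the destination of each edge.

```python
# Hypothetical toy citation graph: each edge is (citing_paper, cited_paper).
# Paper 2 is the newest and cites papers 0 and 1; paper 1 cites paper 0.
directed = {(2, 0), (2, 1), (1, 0)}


def symmetrize(edges):
    """Return the edge set with the reverse of every edge added."""
    return set(edges) | {(dst, src) for src, dst in edges}


def in_neighbors(edges, node):
    """Nodes whose messages reach `node`, assuming messages flow src -> dst."""
    return sorted(src for src, dst in edges if dst == node)


undirected = symmetrize(directed)

# In the directed graph the newest paper receives no messages at all,
# while after symmetrization it aggregates from the papers it cites.
print(in_neighbors(directed, 2))
print(in_neighbors(undirected, 2))
```

In this sketch the symmetric edge set no longer records which endpoint was the citing paper, so the direction cannot be recovered from the adjacency alone. That is the change the question is about.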

Second, with a multi-layer GraphSAGE or GAT, isn't there a chance that validation and test nodes are used in message passing? If so, isn't that a kind of cheating, even if the labels of those nodes are never shown?
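
The second question can be illustrated with a toy example of a transductive setup, where the whole graph is used for message passing but only training labels enter the loss. This is a sketch under assumptions: the four-node path graph, the scalar features, the mean aggregation and the squared-error loss are all hypothetical and are not taken from gnn.py.

```python
# Hypothetical undirected path graph 0 - 1 - 2 - 3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
labels = {0: 1.0, 1: 0.0, 2: 1.0, 3: 0.0}
train_idx = [0, 1]  # only these labels enter the loss
test_idx = [2, 3]   # features are visible, labels are held out


def mean_aggregate(h):
    """One round of message passing: each node averages its neighbours."""
    return {v: sum(h[u] for u in nbrs) / len(nbrs) for v, nbrs in neighbors.items()}


def train_loss(feats, labs):
    """Two-layer aggregation, then squared error on training nodes only."""
    h = mean_aggregate(mean_aggregate(feats))
    return sum((h[v] - labs[v]) ** 2 for v in train_idx) / len(train_idx)


base = train_loss(features, labels)

# Changing the held-out labels leaves the training loss untouched.
flipped = dict(labels)
flipped.update({2: 0.0, 3: 1.0})
same = train_loss(features, flipped)

# Changing a test node's feature does change it: node 3 is two hops from
# training node 1, so a two-layer model sees its feature.
perturbed = dict(features)
perturbed[3] = 10.0
different = train_loss(perturbed, labels)

print(base == same, base == different)
```

So in this sketch the features of test nodes do take part in message passing, but their labels never influence training.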

weihua916 commented 2 years ago

It is just for simplicity. See https://github.com/snap-stanford/ogb/issues/58