Link prediction evaluation issue

Hello,

We think there may be an issue with how the link prediction AUC/AP scores are evaluated in your code. In run_lp.py, you symmetrize the positive val and test edges: data.val_pos_edge_index = gutils.to_undirected(data.val_pos_edge_index). However, the negative val and test edges are not symmetrized. As a result, each positive edge appears twice but each negative edge only once. For the Cora dataset, for example, you end up with 526 positive and 263 negative val edges, and 1054 positive and 527 negative test edges.

The AP score is sensitive to this imbalance: duplicating every positive sample while leaving the negatives unchanged inflates the precision at every threshold, so the evaluation score increases. Indeed, when we ran the code with the negative val/test edges symmetrized as well, we got results a few percent lower.
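The effect can be reproduced on a toy example. The sketch below is only an illustration, not the repository's evaluation code: the scores are made up, and `average_precision` and `roc_auc` are hypothetical helpers written in plain Python (AP is computed per distinct score threshold, in the style of the usual step-wise definition). It compares a balanced set, a set with only the positives duplicated, and a set with both positives and negatives duplicated.

```python
def average_precision(labels, scores):
    """Step-wise AP: sum over distinct thresholds of (recall gain) * precision."""
    total_pos = sum(labels)
    tp = fp = 0
    ap = 0.0
    # Visit distinct score values from highest to lowest so ties share a threshold.
    for threshold in sorted(set(scores), reverse=True):
        new_tp = sum(1 for l, s in zip(labels, scores) if s == threshold and l == 1)
        new_fp = sum(1 for l, s in zip(labels, scores) if s == threshold and l == 0)
        tp += new_tp
        fp += new_fp
        if new_tp:
            ap += (new_tp / total_pos) * (tp / (tp + fp))
    return ap


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up model scores for three positive and three negative edges (assumption).
pos_scores = [0.9, 0.6, 0.4]
neg_scores = [0.8, 0.5, 0.2]


def evaluate(pos, neg):
    labels = [1] * len(pos) + [0] * len(neg)
    scores = list(pos) + list(neg)
    return average_precision(labels, scores), roc_auc(labels, scores)


# Balanced: one copy of every edge.
ap_balanced, auc_balanced = evaluate(pos_scores, neg_scores)
# Only positives symmetrized: each positive edge counted twice.
ap_pos_doubled, auc_pos_doubled = evaluate(pos_scores * 2, neg_scores)
# Both symmetrized: every edge counted twice.
ap_both_doubled, auc_both_doubled = evaluate(pos_scores * 2, neg_scores * 2)

print("AP :", ap_balanced, ap_pos_doubled, ap_both_doubled)
print("AUC:", auc_balanced, auc_pos_doubled, auc_both_doubled)
```

On this toy data, duplicating only the positives raises AP, while duplicating both positives and negatives gives the same AP as the balanced set. AUC is the same in all three cases, since it depends only on how positive/negative pairs are ranked.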

The code for the other link prediction work that you compare against in the readme / Table 8, including the reference implementation tkipf/gae, evaluates with an equal number of positive and negative edges.

We'd appreciate some clarification on this. Is it a bug?

graph-star-team / graph_star
