Task

Which benchmark's performance more accurately reflects the practical effectiveness of a link prediction method?

I approach the question by testing the link prediction methods on the link retrieval task outlined in https://www.pnas.org/doi/10.1073/pnas.2312527121. In short, the task asks a method to find the k most probable edges for each node, scores each node's prediction by max(precision, recall), and averages the scores over all nodes. This benchmark mirrors the practical setting more closely than the classification benchmarks do.
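To make the scoring concrete, here is a minimal sketch of the per-node max(precision, recall) score averaged over nodes. The function name `retrieval_score`, the dictionary-based inputs, and the choice to skip nodes without held-out edges are my assumptions for illustration; they are not taken from the paper or from this repository.

```python
def retrieval_score(predicted, held_out, k):
    """Average of max(precision, recall) over nodes.

    predicted: dict mapping node -> list of candidate neighbors, most probable first
    held_out:  dict mapping node -> set of true (held-out) neighbors
    k:         number of top predictions evaluated per node
    """
    scores = []
    for node, true_neighbors in held_out.items():
        if not true_neighbors:
            # Assumption: nodes without held-out edges are skipped.
            continue
        top_k = predicted.get(node, [])[:k]
        hits = len(set(top_k) & set(true_neighbors))
        precision = hits / len(top_k) if top_k else 0.0
        recall = hits / len(true_neighbors)
        scores.append(max(precision, recall))
    return sum(scores) / len(scores) if scores else 0.0


# Toy example with two nodes.
predicted = {"a": ["b", "c", "d"], "b": ["a", "d", "c"]}
held_out = {"a": {"b"}, "b": {"c", "d", "e", "f"}}
score = retrieval_score(predicted, held_out, k=2)
```

Taking the maximum keeps the score from penalizing a node whose number of held-out edges is far from k: a node with one true edge can still reach a score of 1 through recall, and a node with many true edges can reach it through precision.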

Now, I have three benchmark results:

Link classification benchmark based on uniform sampling
Link classification benchmark based on biased sampling
Link retrieval benchmark based on the PNAS paper.

I then compare the rankings of methods that the benchmarks produce for each network and quantify their similarity using Rank Biased Overlap (RBO). RBO measures the similarity of two ordered lists, as the Spearman correlation does, but places higher weight on the top of the lists, i.e., on the top-performing methods.
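As an illustration of how Rank Biased Overlap weights the top of a ranking, here is a minimal sketch of the extrapolated form for two tie-free rankings of equal length (Webber et al., 2010). The function name `rbo_ext`, the persistence parameter `p = 0.9`, and the method names in the example are assumptions for illustration, not necessarily what the analysis in this issue uses.

```python
def rbo_ext(ranking_a, ranking_b, p=0.9):
    """Extrapolated Rank Biased Overlap for two tie-free rankings.

    Rankings are lists ordered from best to worst, without duplicates.
    Only the first k = min(len(a), len(b)) positions are compared.
    Smaller p puts more weight on the top ranks.
    """
    k = min(len(ranking_a), len(ranking_b))
    if k == 0:
        return 0.0
    seen_a, seen_b = set(), set()
    overlap = 0          # size of the intersection of the two prefixes at depth d
    weighted_sum = 0.0   # sum over d of (overlap_d / d) * p**d
    for d in range(1, k + 1):
        a, b = ranking_a[d - 1], ranking_b[d - 1]
        if a == b:
            overlap += 1
        else:
            # Each new item counts if it already appeared in the other prefix.
            overlap += (a in seen_b) + (b in seen_a)
        seen_a.add(a)
        seen_b.add(b)
        weighted_sum += (overlap / d) * p**d
    return (overlap / k) * p**k + (1 - p) / p * weighted_sum


# Hypothetical method rankings from two benchmarks on one network.
reference = ["method_A", "method_B", "method_C", "method_D"]
swap_bottom = ["method_A", "method_B", "method_D", "method_C"]
swap_top = ["method_B", "method_A", "method_C", "method_D"]

rbo_bottom = rbo_ext(reference, swap_bottom)
rbo_top = rbo_ext(reference, swap_top)
```

Both perturbed rankings differ from the reference by a single adjacent swap, so a rank correlation that treats all positions equally would score them the same. RBO scores the swap at the top as less similar, which is the behavior wanted when the best-performing methods matter most.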

Results

The ranking correlation is computed separately for each network dataset, and each circle represents one dataset.

skojaku / degree-corrected-link-prediction-benchmark

Correlation between the benchmark performance and link retrieval performance #64
