skojaku / degree-corrected-link-prediction-benchmark

Link prediction
MIT License

Fig make fig1 #43

Closed · skojaku closed this 1 year ago

skojaku commented 1 year ago

Figure 1:

[Screenshot: Screen Shot 2023-05-08 at 9 52 35 PM]

This figure illustrates the issue of uniform negative sampling. The left panels show schematics of the sampling. The third panel shows the degree-degree distributions of positive and negative edges, demonstrating that the two edge classes can be distinguished by degree alone. The rightmost panel compares the performance of preferential attachment and node2vec on 89 networks, evaluated on the classification vs. the search task, showing that evaluation based on uniform sampling does not align with performance on the search task.
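For concreteness, a minimal sketch of the problem (not this repo's code; `networkx` and a BA graph are just stand-ins for a heavy-tailed network): under uniform sampling, negative pairs are dominated by low-degree nodes, so positives and negatives separate by degree alone.

```python
import random

import networkx as nx

# Toy graph with a heavy-tailed degree distribution (stand-in for a real network).
G = nx.barabasi_albert_graph(n=1000, m=3, seed=42)


def sample_uniform_negatives(G, n_samples, seed=0):
    """Sample node pairs uniformly at random, keeping only non-edges."""
    rng = random.Random(seed)
    nodes = list(G.nodes())
    negatives = []
    while len(negatives) < n_samples:
        u, v = rng.sample(nodes, 2)
        if not G.has_edge(u, v):
            negatives.append((u, v))
    return negatives


pos = list(G.edges())
neg = sample_uniform_negatives(G, len(pos))

# Positive edges attach to high-degree nodes far more often than uniform
# negatives do, so a degree-based score alone separates the two classes.
deg = dict(G.degree())
mean_pos = sum(deg[u] * deg[v] for u, v in pos) / len(pos)
mean_neg = sum(deg[u] * deg[v] for u, v in neg) / len(neg)
print(f"mean degree product -- positives: {mean_pos:.1f}, negatives: {mean_neg:.1f}")
```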

skojaku commented 1 year ago

Figure 2:

[Screenshot: Screen Shot 2023-05-08 at 9 54 17 PM]

This figure illustrates that the proposed bias-aligned sampling is better aligned with the evaluation for the search task. (A) The degree distributions of positive and negative edges are indistinguishable. (B) The evaluation aligns better with the search task. (C) The method ranking is more similar to the method ranking for the search task.
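A minimal sketch of one way such a sampler could work, assuming each negative endpoint is drawn with probability proportional to its degree so that negatives match the degree profile of positives (the actual sampler in this repo may differ in details):

```python
import random


def sample_biased_negatives(G, n_samples, seed=0):
    """Draw each endpoint with probability proportional to its degree and
    keep only non-edges, so that negative pairs mimic the degree profile
    of positive edges. (One plausible implementation, for illustration.)"""
    rng = random.Random(seed)
    nodes = list(G.nodes())
    weights = [G.degree(u) for u in nodes]
    negatives = []
    while len(negatives) < n_samples:
        u, v = rng.choices(nodes, weights=weights, k=2)
        if u != v and not G.has_edge(u, v):
            negatives.append((u, v))
    return negatives
```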

Note: in the search task considered here, we ask the algorithm to provide the 50 most probable edges to other nodes for each node. We then calculate the precision/recall for individual nodes and average over all nodes. I've checked that the results are consistent across other metrics such as F1 and recall, as well as their micro versions.
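For reference, a sketch of this per-node metric; the dense `(n, n)` score and adjacency arrays and the function name are illustrative, not the repo's API:

```python
import numpy as np


def average_precision_at_k(scores, heldout_adj, k=50):
    """For each node, rank the other nodes by predicted score, take the
    top k, and compute precision against the held-out edges; then
    macro-average over all nodes."""
    n = scores.shape[0]
    precisions = []
    for i in range(n):
        s = scores[i].astype(float).copy()
        s[i] = -np.inf                 # never recommend a self-loop
        top_k = np.argsort(-s)[:k]     # indices of the k highest scores
        precisions.append(heldout_adj[i, top_k].sum() / k)
    return float(np.mean(precisions))
```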

skojaku commented 1 year ago

I wish to have Figure 3, where we train GNNs and show that the bias-aligned negative sampling is also a better training method.

skojaku commented 1 year ago

There are already a bunch of figures for robustness checks, which will be put in the SI.

skojaku commented 1 year ago

We may want to change the wording: "degree biased" -> "bias aligned", since "biased sampling" doesn't sound good for evaluation. "Bias aligned" sounds like a good thing.

rachithaiyappa commented 1 year ago
skojaku commented 1 year ago

Oh, yeah, I should have explained that. Since the ranking is discrete, it often happens that multiple networks land on the same coordinate, on top of each other (an occlusion problem). So the size here shows the number of networks at that point: a larger circle means more networks.
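A toy sketch of that occlusion fix, scaling the marker area by the number of networks at each coordinate (the rank pairs here are made up for illustration):

```python
from collections import Counter

import matplotlib.pyplot as plt

# Toy (rank under uniform sampling, rank under search task) pairs; several
# networks share the same coordinate.
pairs = [(1, 2), (1, 2), (1, 2), (2, 1), (3, 3)]

counts = Counter(pairs)
xs, ys = zip(*counts.keys())
sizes = [40 * c for c in counts.values()]  # marker area ~ number of networks

plt.scatter(xs, ys, s=sizes)
plt.xlabel("Method rank (uniform sampling)")
plt.ylabel("Method rank (search task)")
plt.show()
```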

skojaku commented 1 year ago

Rank-biased overlap (RBO) is a ranking similarity metric that puts stronger weight on top-ranked items. We can say "Ranking similarity (RBO)" on the y-axis, since RBO, although cited within some fields like neuroscience, is not well known, as you pointed out.
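For anyone unfamiliar with the metric, a compact sketch of truncated RBO (Webber et al., 2010); `p` controls how strongly the top ranks are weighted:

```python
def rbo(ranking_s, ranking_t, p=0.9):
    """Truncated rank-biased overlap: the overlap between the top-d items
    of the two rankings at each depth d, weighted by p**(d - 1), so that
    agreement near the top counts more. Assumes no repeats in a ranking."""
    depth = min(len(ranking_s), len(ranking_t))
    seen_s, seen_t = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        seen_s.add(ranking_s[d - 1])
        seen_t.add(ranking_t[d - 1])
        score += p ** (d - 1) * len(seen_s & seen_t) / d
    return (1 - p) * score


# Identical rankings score highest; disagreement at the top hurts more
# than disagreement further down.
print(rbo(["pa", "n2v", "jaccard"], ["pa", "n2v", "jaccard"]))
print(rbo(["pa", "n2v", "jaccard"], ["n2v", "pa", "jaccard"]))
```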

skojaku commented 1 year ago

I'll reorganize the figures, correct the typos, and reword the labels. So please feel free to leave suggestions and comments!