cthoyt / cthoyt.github.io

My personal website, served at https://cthoyt.com
Creative Commons Attribution 4.0 International

Pythagorean Mean Rank Metrics | Biopragmatics #41

Open utterances-bot opened 2 years ago

utterances-bot commented 2 years ago

Pythagorean Mean Rank Metrics | Biopragmatics

The mean rank (MR) and mean reciprocal rank (MRR) are among the most popular metrics reported for the evaluation of knowledge graph embedding models in the link prediction task. While they are reported on very different intervals ($\text{MR} \in [1,\infty)$ and $\text{MRR} \in (0,1]$), their deep theoretical connection can be elegantly described through the lens of Pythagorean means. This blog post describes ideas Max Berrendorf shared with me that I recently implemented in PyKEEN and later wrote up as a full manuscript.

https://cthoyt.com/2021/04/19/pythagorean-mean-ranks.html
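
For readers of this thread, the connection mentioned in the summary can be illustrated with a minimal sketch: the MR is the arithmetic mean of the ranks, and the MRR is the reciprocal of their harmonic mean. The ranks below are made up for illustration, and the code uses only Python's standard library rather than PyKEEN's own API.

```python
from statistics import fmean, geometric_mean, harmonic_mean

# Hypothetical ranks assigned to four true triples (illustrative values only).
ranks = [1, 2, 4, 8]

# Mean rank (MR): the arithmetic mean of the ranks.
mean_rank = fmean(ranks)

# Mean reciprocal rank (MRR): the arithmetic mean of the reciprocal ranks.
mean_reciprocal_rank = fmean([1 / r for r in ranks])

# The harmonic mean of the ranks is the reciprocal of the MRR.
harmonic_mean_rank = harmonic_mean(ranks)

# The geometric mean is the third Pythagorean mean of the same ranks.
geometric_mean_rank = geometric_mean(ranks)

print(f"MR (arithmetic mean):  {mean_rank:.4f}")
print(f"geometric mean rank:   {geometric_mean_rank:.4f}")
print(f"harmonic mean rank:    {harmonic_mean_rank:.4f}")
print(f"MRR:                   {mean_reciprocal_rank:.4f}")
print(f"1 / harmonic mean:     {1 / harmonic_mean_rank:.4f}")
```

For positive ranks, the three means always satisfy harmonic ≤ geometric ≤ arithmetic, which is one way to see why MR and MRR can disagree on the same set of ranks.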

micmarcinkowski commented 2 years ago

Hi, I wonder why MR is defined such that, for the best possible embedding (i.e., the set of ranks for the true triples is 1, 2, 3, ...), it grows with the degree of the vertex in the graph. A more natural definition would be

$$\text{MR} = \frac{1}{|\mathcal{I}|} \left( \sum_{r \in \mathcal{I}} r - \binom{|\mathcal{I}|}{2} \right)$$

Now MR = 1 for an embedding as above.
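
The arithmetic behind this proposal can be checked with a small sketch. The function names `mean_rank` and `shifted_mean_rank` are hypothetical and not part of PyKEEN; the second one simply implements the formula above, assuming the ideal case where $n$ true triples receive ranks $1, 2, \dots, n$.

```python
from math import comb


def mean_rank(ranks):
    """Standard mean rank: the arithmetic mean of the ranks."""
    return sum(ranks) / len(ranks)


def shifted_mean_rank(ranks):
    """Hypothetical adjusted mean rank from the comment above.

    Subtracts C(n, 2) from the sum of ranks before averaging, where n is
    the number of ranks, so that the ideal ranks 1..n give exactly 1.
    """
    n = len(ranks)
    return (sum(ranks) - comb(n, 2)) / n


for n in (1, 5, 10, 100):
    ideal = list(range(1, n + 1))  # best case: true triples ranked 1..n
    # The standard MR grows as (n + 1) / 2; the shifted version stays at 1.
    print(n, mean_rank(ideal), shifted_mean_rank(ideal))
```

For ranks $1, \dots, n$ the sum is $n(n+1)/2$ and $\binom{n}{2} = n(n-1)/2$, so the difference is $n$ and the shifted value is $n / n = 1$, while the standard MR equals $(n+1)/2$.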

cthoyt commented 2 years ago

That's interesting. I haven't had a chance to look into this carefully, but I think you're onto a really interesting idea for adjusting these metrics for variations in graph topology. We recently had quite a few thoughts on that, which we described in https://arxiv.org/abs/2203.07544 (see code at https://github.com/pykeen/ranking-metrics-manuscript)