Hi, I wonder why MR is defined such that, for the best possible embedding (i.e., the ranks of the true triples are 1, 2, 3, ...), it grows with the degree of the vertex in the graph. A more natural definition would be
$$\text{MR} =\frac{1}{|\mathcal{I}|} (\sum \limits_{r \in \mathcal{I}} r - {|\mathcal{I}| \choose 2})$$
Now MR = 1 for an embedding as above.
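To make the arithmetic concrete, here is a minimal sketch in Python comparing the usual MR with the adjusted version proposed above. The function names `mean_rank` and `adjusted_mean_rank` are made up for illustration and are not part of PyKEEN; `ranks` plays the role of $\mathcal{I}$.

```python
from math import comb


def mean_rank(ranks):
    """Plain arithmetic mean of the ranks."""
    return sum(ranks) / len(ranks)


def adjusted_mean_rank(ranks):
    """Proposed variant: subtract C(n, 2) from the rank sum before averaging."""
    n = len(ranks)
    return (sum(ranks) - comb(n, 2)) / n


# Best possible ranking of five true triples: ranks 1..5.
ideal = [1, 2, 3, 4, 5]
print(mean_rank(ideal))           # 3.0 (grows with the number of true triples)
print(adjusted_mean_rank(ideal))  # 1.0 (independent of the number of true triples)
```

For ranks $1, \dots, n$ the sum is $n(n+1)/2$, and subtracting $\binom{n}{2} = n(n-1)/2$ leaves $n$, so the adjusted value is always 1.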
That's interesting. I haven't had a chance to look into this carefully, but I think you're on to a really interesting idea: adjusting these metrics for variations in graph topology. We've had quite a few thoughts on that lately, which we described in https://arxiv.org/abs/2203.07544 (see code at https://github.com/pykeen/ranking-metrics-manuscript).
Pythagorean Mean Rank Metrics | Biopragmatics
The mean rank (MR) and mean reciprocal rank (MRR) are among the most popular metrics reported for the evaluation of knowledge graph embedding models in the link prediction task. While they are reported on very different intervals ($\text{MR} \in [1,\infty)$ and $\text{MRR} \in (0,1]$), their deep theoretical connection can be elegantly described through the lens of Pythagorean means. This blog post describes ideas Max Berrendorf shared with me that I recently implemented in PyKEEN and later wrote up as a full manuscript.
https://cthoyt.com/2021/04/19/pythagorean-mean-ranks.html
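As a rough illustration of the Pythagorean-means connection mentioned in the summary above, the sketch below uses only Python's standard `statistics` module (not PyKEEN's API). The rank list is a made-up example. It checks that MR is the arithmetic mean of the ranks and that MRR is the reciprocal of their harmonic mean.

```python
import math
from statistics import geometric_mean, harmonic_mean, mean

# Hypothetical ranks of three true triples.
ranks = [1, 2, 4]

mr = mean(ranks)                        # arithmetic mean rank (MR)
mrr = mean([1 / r for r in ranks])      # mean reciprocal rank (MRR)
hmr = harmonic_mean(ranks)              # harmonic mean rank
gmr = geometric_mean(ranks)             # geometric mean rank

# MRR is the reciprocal of the harmonic mean rank.
print(math.isclose(mrr, 1 / hmr))

# The Pythagorean means are ordered: harmonic <= geometric <= arithmetic.
print(hmr <= gmr <= mr)
```

Both checks should print `True` for any list of positive ranks.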