Closed sumitpai closed 4 years ago
I think it may be interesting to look at this paper as well: https://arxiv.org/abs/2002.06914.
They call worst/middle/best ranking pessimistic/realistic/optimistic and also list methods used for many existing models. Additionally, they introduce a new ranking mechanism called Adjusted Mean Rank (AMR) which should make it possible to compare results between datasets or different train/test splits on the same dataset.
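As a rough sketch of that metric (my reading of the paper, not AmpliGraph code): AMR divides the observed mean rank by the mean rank a uniformly random scorer would get, which for a query with k candidates is (k + 1) / 2. Values well below 1 mean better than chance, so the number stays comparable across candidate-set sizes. The function name and signature here are illustrative.

```python
import numpy as np

def adjusted_mean_rank(ranks, num_candidates):
    """Sketch of Adjusted Mean Rank (AMR): observed mean rank divided by
    the expected mean rank of a random scorer, (k_i + 1) / 2 per query.
    Illustrative helper, not part of any library API."""
    ranks = np.asarray(ranks, dtype=float)
    k = np.asarray(num_candidates, dtype=float)
    expected = np.mean((k + 1) / 2)  # chance-level mean rank
    return float(np.mean(ranks) / expected)

# Two test queries with 10 and 100 candidates respectively
print(adjusted_mean_rank([2, 5], [10, 100]))  # -> 0.125
```

Because the denominator scales with the candidate-set size, the same model ranked on a dataset with far more entities no longer looks artificially worse, which is the comparability the paper is after.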
Fixed by pull request #214
Background and Context

In AmpliGraph, while evaluating corruptions, if the test triple gets the same score as any of the corruptions, we assign the worst rank. Other approaches are followed in the literature. We should implement all three strategies so that the user can compare model performance under each of them.

Description

Let's look at each of the three strategies in detail with an example.
Assume there are only 10 corruptions, and assume that all of them get the same score as the test triple (so there are 11 tied candidates in total). The ranks assigned by the three strategies are:

- Best (optimistic): 1
- Middle (realistic): 6
- Worst (pessimistic): 11
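The tie-handling above can be sketched in a few lines of NumPy. This is an illustrative helper, not AmpliGraph's actual evaluation code; the function name and signature are made up for this example.

```python
import numpy as np

def rank_with_ties(scores, test_idx):
    """Return (best, worst, middle) ranks of the test triple among its
    corruptions, assuming higher score = better.
    Illustrative only; not AmpliGraph's real API."""
    test_score = scores[test_idx]
    others = np.delete(scores, test_idx)          # the corruptions
    best = 1 + np.sum(others > test_score)        # ties broken in our favour
    worst = 1 + np.sum(others >= test_score)      # ties broken against us
    middle = (best + worst) / 2                   # average of the extremes
    return int(best), int(worst), float(middle)

# 10 corruptions, all tied with the test triple's score
scores = np.full(11, 0.5)
print(rank_with_ties(scores, 0))  # -> (1, 11, 6.0)
```

When there are no ties, all three strategies agree, so the choice only matters for models that assign duplicate scores.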