seffnet / seffnet

Network representation learning on drug-target-side effects-indication graphs for side effect prediction
https://seffnet.readthedocs.io
MIT License

Add explanation of the hits@k and mean value evaluation metrics to "Train and Evaluate Models.ipynb" #1

Closed · cthoyt closed this issue 5 years ago

cthoyt commented 5 years ago

Make sure that someone reading this notebook can follow along, especially if they're not familiar with this type of machine learning.

@mali-git would probably be happy to send you a couple references that explain it really well. Make sure you include these citation(s) in the notebook, too.

dhimmel commented 3 years ago

I didn't see any info describing hits@k in Train and Evaluate Translational Models.ipynb.

Can you point to somewhere that defines this metric? I currently understand it as: of the top k predictions, the number that were true positives. But it seems to be a proportion, so maybe it's the precision of the top k predictions?

Anyways, I got to this issue because I googled hits@k after I saw it in:

> Understanding the Performance of Knowledge Graph Embeddings in Drug Discovery
> Stephen Bonner, Ian P Barrett, Cheng Ye, Rowan Swiers, Ola Engkvist, Charles Tapley Hoyt, William L Hamilton
> arXiv (2021-06-08) https://arxiv.org/abs/2105.10488

dhimmel commented 3 years ago

Ah I found the description in https://arxiv.org/abs/2006.13365

> Hits@K: Hits@K denotes the ratio of the test triples that have been ranked among the top k triples, i.e.,
>
> $$\text{Hits@}k = \frac{\left|\{t \in \mathcal{K}_{\text{test}} : \text{rank}(t) \le k\}\right|}{\left|\mathcal{K}_{\text{test}}\right|}$$

So this is the percentage of all positives that appear in the top k predictions. This is the same as recall@k?
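
For anyone following along, here's a minimal sketch (not from the notebook; the helper name is hypothetical) of how hits@k is computed from the rank that each test triple's true entity receives:

```python
def hits_at_k(ranks, k):
    """Fraction of test triples whose true entity is ranked in the top k."""
    return sum(rank <= k for rank in ranks) / len(ranks)

# e.g., ranks of the true tail entity for five test triples
ranks = [1, 4, 12, 2, 50]
print(hits_at_k(ranks, k=10))  # 0.6 -> three of the five ranks are <= 10
```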

cthoyt commented 3 years ago

Lucky for you, I am also a life scientist who hates the way computer scientists write about this stuff (even my co-authors of PyKEEN are culprits 😅), and I wrote some educational material on this:

https://pykeen.readthedocs.io/en/stable/tutorial/understanding_evaluation.html

Btw I've never seen anyone talk about recall@k. Where did you come across this?

dhimmel commented 3 years ago

> Btw I've never seen anyone talk about recall@k. Where did you come across this?

I've seen @eric-czech use this term. It also seems to be somewhat common, as per this blog post. The @k denotes the threshold, i.e. the top k predictions, and hits/recall denotes the metric. Recall seems a bit more natural here, since it's a proportion, whereas the word "hits" implies a count. Finally, there is the precision-recall curve, which is common.

The docs are very helpful! Thanks for writing them up. Under hits@k, the following sounds correct:

> The hits@k describes the fraction of true entities that appear in the first k entities of the sorted rank list.

However, the next sentence sounds wrong:

> For example, if Google shows 20 results on the first page, then the percentage of results that are relevant is the hits@20.

This sentence seems to be describing precision@20 (what percent of the top 20 results are relevant) rather than recall@20.
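
To make the distinction concrete, here is a toy sketch (hypothetical data; standard IR definitions, not code from the repo) contrasting precision@k and recall@k on the same ranked list:

```python
def precision_at_k(relevance, k):
    """Of the top k results, what fraction are relevant?"""
    return sum(relevance[:k]) / k

def recall_at_k(relevance, k, num_relevant):
    """Of all relevant items, what fraction appear in the top k?"""
    return sum(relevance[:k]) / num_relevant

# 1 = relevant, 0 = not relevant; 5 relevant items exist in total
relevance = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(precision_at_k(relevance, 20))  # 3/20 = 0.15
print(recall_at_k(relevance, 20, 5))  # 3/5  = 0.60
```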

mhoangvslev commented 2 years ago

@dhimmel I came to the same conclusion as you: since hits counts the true positives (predictions that match the ground truth) and is evaluated over the prediction set, hits@k is equivalent to precision@k.

```
precision = tp / (tp + fp)
prediction = tp + fp
```
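
For what it's worth, a quick numeric check (toy ranks; using the definitions quoted earlier in the thread, where each test triple has exactly one true entity) shows how the three quantities relate:

```python
# With exactly one true entity per test triple, per-triple recall@k is a
# 0/1 indicator, so mean recall@k equals hits@k, while mean precision@k
# is that same indicator divided by k.
ranks = [1, 4, 12, 2, 50]  # rank of the true entity for five test triples
k = 10

hits_at_k = sum(r <= k for r in ranks) / len(ranks)                  # 0.6
mean_recall_at_k = sum((r <= k) / 1 for r in ranks) / len(ranks)     # 0.6 (one relevant item each)
mean_precision_at_k = sum((r <= k) / k for r in ranks) / len(ranks)  # 0.06 == hits_at_k / k
print(hits_at_k, mean_recall_at_k, mean_precision_at_k)
```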