Closed by lvguofeng 3 years ago
We follow the standard definition of Hits@K; see, e.g., http://nlpprogress.com/english/relation_prediction.html
For each positive edge, we check whether it ranks within the top K when compared against the negative edges, i.e., whether its score is higher than the K-th highest negative score. Hits@K is the fraction of positive edges that the model ranks this way.
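The definition above can be sketched roughly as follows. This is only an illustration, not the repository code: the function name `hits_at_k`, the use of plain Python lists, the strict `>` comparison, and the handling of fewer than K negatives are all assumptions.

```python
def hits_at_k(pos_pred, neg_pred, k):
    """Fraction of positive scores strictly above the K-th highest negative score.

    Assumes pos_pred is non-empty and k >= 1.
    """
    if len(neg_pred) < k:
        # Assumption: with fewer than K negatives, every positive
        # is treated as ranked within the top K.
        return 1.0
    # K-th highest negative score (k is 1-based, the list index is 0-based).
    kth_neg = sorted(neg_pred, reverse=True)[k - 1]
    hits = sum(1 for p in pos_pred if p > kth_neg)
    return hits / len(pos_pred)


pos = [0.9, 0.6, 0.2]
neg = [0.8, 0.7, 0.5, 0.1]
print(hits_at_k(pos, neg, 2))  # only 0.9 exceeds the 2nd-highest negative (0.7)
```

Note that the denominator is the number of positive edges, not K, so the value can reach 1.0 regardless of how many positives there are.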
Hello,
I read your code for Hits@K.
I am not sure whether my understanding of Hits@K is wrong. In your implementation, if npk is the K-th largest prediction among the negative samples (npk = Top-k(neg_pred)[k]), then:

$$\text{Hits@K} = \textbf{len}([pos_{pred} > npk]) / \textbf{len}(pos_{pred})$$

(That is, the proportion of positive samples scoring above npk among all positive samples.)
My understanding is that Hits@K is the proportion of positive samples among the top K values of neg_pred + pos_pred:

$$\text{Hits@K} = \textbf{len}([\textbf{Top-k}(neg_{pred} + pos_{pred}) \in \textbf{Pos}]) / K$$

Which is the real Hits@K, and which of the two evaluations is better for the link prediction task? The formula is not well written; I hope it can be understood.
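To make the difference between the two readings concrete, here is a small sketch that computes both on the same toy scores. The function names `hits_at_k_threshold` and `top_k_positive_fraction` and the example scores are made up for illustration; neither function is taken from the repository.

```python
def hits_at_k_threshold(pos_pred, neg_pred, k):
    """Share of positives scoring strictly above the K-th highest negative."""
    kth_neg = sorted(neg_pred, reverse=True)[k - 1]
    return sum(1 for p in pos_pred if p > kth_neg) / len(pos_pred)


def top_k_positive_fraction(pos_pred, neg_pred, k):
    """Share of positives among the K highest scores of the pooled predictions."""
    # Label each score: 1 for a positive sample, 0 for a negative one.
    labeled = [(s, 1) for s in pos_pred] + [(s, 0) for s in neg_pred]
    top_k = sorted(labeled, key=lambda t: t[0], reverse=True)[:k]
    return sum(label for _, label in top_k) / k


pos = [0.9, 0.6, 0.2]
neg = [0.8, 0.7, 0.5, 0.1]
print(hits_at_k_threshold(pos, neg, 2))      # 1 of 3 positives beats 0.7
print(top_k_positive_fraction(pos, neg, 2))  # top 2 pooled scores: 0.9 (pos), 0.8 (neg)
```

On these scores the two quantities differ (one third versus one half). The first is normalized by the number of positives, while the second is normalized by K, so it cannot reach 1.0 when there are fewer than K positives.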