Closed nomisto closed 2 years ago
Can you rephrase your question? There is no negative sampling in the evaluation.
Sorry, I think I've just found the answer in the appendix of your paper.
What I meant is: what is the size of, for example, the set of corrupted triples (h,r,t') among which the score of the true triple (h,r,t) is ranked? According to your paper it is, as I assumed, all possible entities t' ∈ E, where E is the set of all entities in the dataset.
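To make the question concrete, here is a minimal sketch of what "ranking against all entities" means for a single query (h, r, ?). It is not LibKGE code: the function name `rank_against_all` and the score array are hypothetical, higher scores are assumed to mean more plausible triples, and filtering of other known true triples and tie handling are left out.

```python
import numpy as np

def rank_against_all(scores, true_idx):
    """Rank of the true tail among ALL entities (1 = best).

    scores:   1-D array with one score per entity for the query (h, r, ?);
              higher is assumed to mean more plausible.
    true_idx: index of the true tail entity t.
    """
    true_score = scores[true_idx]
    # Every entity that scores strictly higher pushes the true tail down by one.
    return int(np.sum(scores > true_score)) + 1

# Toy example with 5 entities; the true tail is entity 3.
scores = np.array([0.1, 0.9, 0.4, 0.7, 0.2])
rank = rank_against_all(scores, true_idx=3)
```

In this setting the candidate set has one entry per entity in the dataset, so its size is simply |E|.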
Some frameworks let you specify the size of the set of randomly corrupted triples against which the true triple is ranked.
For example, dgl-ke:
https://dglke.dgl.ai/doc/eval.html
--neg_sample_size_eval NEG_SAMPLE_SIZE_EVAL Negative sampling size for testing
Sometimes the corruptions can also be restricted by type, so that type(t) == type(t').
We did not add negative sampling to evaluation because (i) it makes evaluation non-deterministic and (ii) it leads to misleading results (e.g., see here, Tab. 6). Both conflict with the goals of LibKGE.
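Both points can be illustrated with a small sketch. This is not code from LibKGE, Dist-KGE, or dgl-ke: `rank_against_sample` is a hypothetical helper, the scores are synthetic, and higher scores are assumed to be better. The rank computed against a random sample depends on the seed, and because the sample is a subset of all entities it can never be worse than the rank against all entities.

```python
import numpy as np

def rank_against_sample(scores, true_idx, sample_size, rng):
    """Rank of the true tail against a random sample of corrupting entities."""
    # Candidate corruptions: every entity except the true tail itself.
    candidates = np.delete(np.arange(len(scores)), true_idx)
    sampled = rng.choice(candidates, size=sample_size, replace=False)
    return int(np.sum(scores[sampled] > scores[true_idx])) + 1

# Synthetic scores for 1000 entities: entity i gets score i / 999.
scores = np.linspace(0.0, 1.0, 1000)
true_idx = 899  # exactly 100 entities score higher than the true tail

# Rank against all entities: deterministic.
full_rank = int(np.sum(scores > scores[true_idx])) + 1

# Rank against 50 sampled entities: depends on the seed, and is bounded
# above by sample_size + 1 regardless of how poor the full rank is.
sampled_ranks = [
    rank_against_sample(scores, true_idx, 50, np.random.default_rng(seed))
    for seed in range(5)
]
```

Here the full rank is 101, while no sampled rank can exceed 51, so metrics such as MRR or Hits@k computed from sampled ranks look better than they would against the full entity set.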
That being said, the feature is implemented in Dist-KGE, where it is called "rank against".
I am sorry if I missed this in the documentation! Unfortunately I couldn't find anything about that.
I wanted to verify that the number of negative samples during evaluation is the number of all entities (i.e. every entity in the dataset can be an answer). Am I right in this assumption?
Greetings, Simon