uma-pi1 / kge

LibKGE - A knowledge graph embedding library for reproducible research
MIT License

Number of negative samples during evaluation #257

Closed nomisto closed 2 years ago

nomisto commented 2 years ago

I am sorry if I missed this in the documentation! Unfortunately I couldn't find anything about that.

I wanted to verify that the number of negative samples during evaluation is the number of all entities (i.e. every entity in the dataset can be an answer). Am I right in this assumption?

Greetings, Simon

rgemulla commented 2 years ago

Can you rephrase your question? There is no negative sampling in the evaluation.

nomisto commented 2 years ago

Sorry, I think I've just found the answer in the appendix of your paper.

What I meant is: what is the size of, for example, the set of corrupted triples (h, r, t') among which the score of the true triple (h, r, t) is ranked? According to your paper it is, as I assumed, all possible entities t' ∈ E, where E is the set of all entities in the dataset.
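To make the question concrete, here is a minimal sketch of ranking the true tail against every entity in the dataset. It is not LibKGE code: the score array, the helper name rank_against_all, and the optimistic tie handling are assumptions made for illustration only.

```python
import numpy as np

def rank_against_all(scores: np.ndarray, true_idx: int) -> int:
    """1-based rank of the true entity among ALL entities.

    scores[i] is the (hypothetical) model score of the triple (h, r, e_i).
    Ties are resolved optimistically here; other tie policies exist.
    """
    true_score = scores[true_idx]
    # Count how many candidate entities score strictly higher than the true one.
    return int(np.sum(scores > true_score)) + 1

# Toy example with |E| = 5 entities; entity 3 is the true tail t.
scores = np.array([0.1, 0.9, 0.4, 0.7, 0.2])
true_tail = 3
rank = rank_against_all(scores, true_tail)
print(rank)  # only entity 1 (score 0.9) beats the true tail, so the rank is 2
```

The candidate set here has size |E| (the true tail plus |E| - 1 corruptions), and the result depends only on the scores, with no randomness involved.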

Some frameworks allow you to specify the size of the set of randomly corrupted triples within which the true triple is ranked.

For example, dgl-ke (https://dglke.dgl.ai/doc/eval.html) has the option --neg_sample_size_eval NEG_SAMPLE_SIZE_EVAL ("Negative sampling size for testing").

Sometimes the corruptions can also be restricted by type, so that type(t) == type(t').

rgemulla commented 2 years ago

We did not add negative sampling to evaluation, because it (i) makes evaluation non-deterministic and (ii) leads to misleading results (e.g., see here, Tab. 6). Both things conflict with the goals of LibKGE.
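Both points can be seen in a small sketch, assuming random toy scores and a hypothetical helper rank_against_sample (neither is taken from LibKGE or Dist-KGE). Ranking against a sampled subset of corruptions can never give a worse rank than ranking against all entities, and the result changes with the random seed.

```python
import numpy as np

def rank_against_sample(scores, true_idx, num_negatives, rng):
    """1-based rank of the true entity among `num_negatives` sampled corruptions."""
    # All entities except the true one are candidate corruptions.
    candidates = np.delete(np.arange(len(scores)), true_idx)
    sampled = rng.choice(candidates, size=num_negatives, replace=False)
    return int(np.sum(scores[sampled] > scores[true_idx])) + 1

# Toy setup: 1000 entities with random scores; entity 0 is the true answer.
scores = np.random.default_rng(0).normal(size=1000)
true_idx = 0

# Deterministic: rank against all entities.
full_rank = int(np.sum(scores > scores[true_idx])) + 1

# Non-deterministic: rank against 50 sampled negatives, repeated with different seeds.
sampled_ranks = [
    rank_against_sample(scores, true_idx, 50, np.random.default_rng(seed))
    for seed in range(5)
]

print("rank against all entities:", full_rank)
print("ranks against 50 samples: ", sampled_ranks)
```

Because the sampled negatives are a subset of all corruptions, every sampled rank is at most the full rank, so metrics such as MRR or Hits@k computed this way tend to look better than they would under full ranking.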

That being said, the feature is implemented in Dist-KGE, where it is called "rank against".