Closed by JuliaGast 3 months ago
Negative sampling for evaluation, i.e. for computing test MRRs, is typically NOT done in static KG link prediction evaluation!
See, e.g.:
Start Small, Think Big: On Hyperparameter Optimization for Large-Scale Knowledge Graph Embeddings
and here ([11] from above):
Parallel Training of Knowledge Graph Embedding Models: A Comparison of Techniques
That paper compares different commonly used approaches to train KGE models, which differ mainly in the way negative examples are generated.
For training with negative samples, different strategies exist:
Since negative sampling is not done in static KG completion, I opt not to use it for the evaluation of TKG forecasting either; instead, I compute the MRR based on scores for all entities in the KG.
Our datasets do not contain more entities or significantly more test triples than static KG benchmarks, so introducing negative sampling for the evaluation of TKG models is not well motivated.
(https://openreview.net/pdf?id=BkxSmlBFvr)
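The 1-vs-all evaluation described above can be sketched as follows (function and variable names are my own; this assumes the model emits a score for every entity per query):

```python
import numpy as np

def mrr_full_ranking(scores: np.ndarray, true_idx: np.ndarray) -> float:
    """Compute MRR by ranking the true entity against ALL entities.

    scores:   (num_queries, num_entities) model scores for every candidate
    true_idx: (num_queries,) index of the ground-truth entity per query
    """
    true_scores = scores[np.arange(len(true_idx)), true_idx]
    # rank = 1 + number of entities scored strictly higher than the truth
    ranks = 1 + (scores > true_scores[:, None]).sum(axis=1)
    return float((1.0 / ranks).mean())

# toy example: 2 queries over 4 entities, truth scored highest in both
scores = np.array([[0.1, 0.9, 0.3, 0.2],
                   [0.5, 0.4, 0.8, 0.6]])
print(mrr_full_ranking(scores, np.array([1, 2])))  # → 1.0
```

Time-aware filtering (removing other true answers at the same timestamp before ranking) would be applied on top of this, but the ranking itself is always over the full entity set.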
(ultra)
What if we identified node types and computed the MRR only over candidate nodes of the same type as the true answer? Would that be a fair evaluation?
Not really: if the model predicts nodes of the wrong type (which happens), this would not be penalized at all. Also, for many datasets node types are not given.
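To make that objection concrete, a toy sketch (entity IDs, types, and scores are all made up) showing how type-constrained ranking hides a wrong-type top prediction that full ranking penalizes:

```python
import numpy as np

# 6 entities of two types; the model wrongly puts its highest score
# on an entity of the WRONG type (entity 3, type 1).
node_type = np.array([0, 0, 0, 1, 1, 1])
scores = np.array([0.1, 0.2, 0.3, 0.9, 0.1, 0.1])
true_entity = 2  # type 0

# Full ranking: the wrong-type prediction pushes the truth to rank 2.
full_rank = 1 + (scores > scores[true_entity]).sum()

# Type-constrained ranking: wrong-type candidates are masked out,
# so the model's mistake vanishes from the metric (rank 1).
mask = node_type == node_type[true_entity]
typed_rank = 1 + (scores[mask] > scores[true_entity]).sum()

print(full_rank, typed_rank)  # → 2 1
```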
Suggestions for how to potentially choose nodes for negative samples: a combination of:
What about negative sampling during training?
Update: After meeting with Michael on April 11th, we decided NOT to do negative sampling, i.e. to use the 1-vs-all strategy.
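For completeness, a minimal numpy sketch of what the 1-vs-all training objective looks like (all names, shapes, and the random embeddings are illustrative, not the repo's actual code): every entity is scored as a candidate object and a softmax cross-entropy is taken over the full entity set, so no negatives are ever sampled.

```python
import numpy as np

rng = np.random.default_rng(0)
num_entities, dim, batch = 50, 16, 4

ent_emb = rng.normal(size=(num_entities, dim))
queries = rng.normal(size=(batch, dim))  # stand-in for encoded (s, r, t) queries
true_obj = np.array([3, 7, 7, 42])       # ground-truth object per query

# 1-vs-all: score EVERY entity as candidate object (no sampled negatives)
scores = queries @ ent_emb.T             # (batch, num_entities)

# numerically stable softmax cross-entropy over ALL entities
logz = scores.max(axis=1, keepdims=True)
log_probs = scores - logz - np.log(np.exp(scores - logz).sum(axis=1, keepdims=True))
loss = -log_probs[np.arange(batch), true_obj].mean()
```

In a real model the scores would come from the TKG scoring function and the loss would be backpropagated; the key point is just that the softmax normalizes over all entities.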