CCIIPLab / CET

This is the source code for: Context-aware Entity Typing in Knowledge Graphs.

Why is all_true used as the label for validation? #2

Closed zhiweihu1103 closed 2 years ago

zhiweihu1103 commented 2 years ago

Why do you use all_true as the label for validation? all_true contains the test labels, which may cause data leakage.

CCIIPLab commented 2 years ago

We follow the “Filter” setting proposed by Bordes et al. [1] during evaluation. For each test sample (e, t) in the test set, we first calculate the relevance score between e and every type, then rank all the types in descending order of relevance score. All other known types of e in the training, validation and test sets are removed from the ranking.
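The filtered ranking described above can be sketched roughly as follows. This is only an illustration, not the repository's code: the function name `filtered_rank`, the list-of-scores layout, and the optimistic handling of ties are all assumptions made for the example.

```python
def filtered_rank(scores, target, known_types):
    """Return the 1-based rank of `target` after filtering known types.

    scores: relevance scores, one per type index (hypothetical layout).
    target: index of the type t being evaluated.
    known_types: indices of all types known to be correct for the entity.
    """
    target_score = scores[target]
    rank = 1
    for type_idx, score in enumerate(scores):
        if type_idx == target or type_idx in known_types:
            continue  # skip the target itself and every filtered-out known type
        if score > target_score:  # assumption: ties are resolved in favour of the target
            rank += 1
    return rank


scores = [0.9, 0.8, 0.7, 0.1]
raw = filtered_rank(scores, target=2, known_types=set())        # no filtering
filtered = filtered_rank(scores, target=2, known_types={0, 2})  # type 0 is also correct
print(raw, filtered)
```

In this toy case the target moves up one position once type 0, another correct type of the same entity, is removed from the ranking.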

A more rigorous experimental setting is to filter only the (entity, type) tuples that appear in the training and validation sets during validation (while still using the original “Filter” setting during testing). According to our experiment, this setting does not significantly change the results:

| Metric | FB15kET | YAGO43kET |
| --- | --- | --- |
| MRR | 0.702 | 0.502 |
| MR | 18 | 242 |
| Hit@1 | 0.621 | 0.400 |
| Hit@3 | 0.746 | 0.562 |
| Hit@10 | 0.859 | 0.689 |
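As a rough sketch of the difference between the two settings, the filter sets could be built per split as shown below, with the usual metrics computed from the resulting ranks. The helper names `build_known_types` and `ranking_metrics` and the toy tuples are hypothetical and are not taken from the repository.

```python
from collections import defaultdict


def build_known_types(*splits):
    """Map each entity to the set of types seen in the given splits."""
    known = defaultdict(set)
    for split in splits:
        for entity, type_name in split:
            known[entity].add(type_name)
    return known


def ranking_metrics(ranks, ks=(1, 3, 10)):
    """Compute MRR, MR and Hit@k from a list of 1-based ranks."""
    n = len(ranks)
    metrics = {
        "MRR": sum(1.0 / r for r in ranks) / n,
        "MR": sum(ranks) / n,
    }
    for k in ks:
        metrics[f"Hit@{k}"] = sum(r <= k for r in ranks) / n
    return metrics


# Toy (entity, type) tuples, invented for illustration only.
train = [("e1", "person")]
valid = [("e1", "artist")]
test = [("e1", "musician")]

valid_filter = build_known_types(train, valid)       # stricter filter for validation
test_filter = build_known_types(train, valid, test)  # original "Filter" setting for testing

print(ranking_metrics([1, 2, 4]))
```

With the stricter filter, test labels never influence the ranks computed during validation, which is the leakage concern raised in the question.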

Please feel free to contact us if you have any other questions.

[1] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multi-relational data. In Advances in neural information processing systems, pp. 2787–2795, 2013.