Closed AlexHex7 closed 2 years ago
Yes. A-->C will be considered a valid negative sample, regardless of whether strict_negative is set to True or False. Note that filtering any sample from the validation / test sets is considered test data leakage. That's why we only filter samples by self.fact_graph.
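To make the point above concrete, here is a minimal sketch of strict negative sampling that filters only against training facts. It is not TorchDrug's implementation: the helper names `negative_candidates` and `sample_negative_tails`, the integer encoding of entities, and the plain `set` standing in for self.fact_graph are all assumptions made for illustration.

```python
import random

# Hypothetical toy encoding: entities A, B, C and a single relation "-->".
A, B, C = 0, 1, 2
REL = 0
NUM_ENTITY = 3

# Stand-in for self.fact_graph: training triples only.
train_facts = {(A, REL, B), (B, REL, C)}
# Validation triples are deliberately NOT consulted during sampling.
valid_facts = {(A, REL, C)}


def negative_candidates(head, relation, num_entity, known_facts, strict=True):
    """Return the tails that may be used to corrupt (head, relation, ?).

    With strict=True, tails that form a known training fact are excluded.
    With strict=False, every entity is a candidate.
    """
    if not strict:
        return list(range(num_entity))
    return [t for t in range(num_entity)
            if (head, relation, t) not in known_facts]


def sample_negative_tails(head, relation, num_entity, known_facts,
                          num_samples, strict=True, seed=0):
    """Draw negative tails uniformly from the allowed candidates."""
    candidates = negative_candidates(head, relation, num_entity,
                                     known_facts, strict)
    if not candidates:
        raise ValueError("no negative candidates left after filtering")
    rng = random.Random(seed)
    return [rng.choice(candidates) for _ in range(num_samples)]


strict_candidates = negative_candidates(A, REL, NUM_ENTITY, train_facts, strict=True)
loose_candidates = negative_candidates(A, REL, NUM_ENTITY, train_facts, strict=False)
negatives = sample_negative_tails(A, REL, NUM_ENTITY, train_facts, num_samples=5)
```

In this toy setup the strict filter removes B as a tail for head A, because A-->B is a training fact, but C stays in the candidate list in both modes. Excluding C would require looking at `valid_facts`, which is exactly the leakage described above.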
@KiddoZhu Thanks for your reply. Yes! I quite agree with what you said about the test data leakage.
In fact, I'm not familiar with the research area of knowledge graph reasoning. What confuses me is that the sample 'A-->C' is treated as a negative sample in the training stage, while it is actually a positive sample in the validation set. This means the trained model may fail to predict it as a positive sample in the validation stage. Will this have a significant impact?
Not much, in my experience. The goal of knowledge graph reasoning is more to denoise / smooth the knowledge graph with learned embeddings or logic rules. That means we can use the learned model to predict missing links, or to flag some existing links as wrong. The former is more meaningful in applications, so it is the only one we evaluate. Besides, it is impossible to avoid using validation links as negative samples unless you know the validation links, which would itself be data leakage.
@KiddoZhu I see. Thanks!
Hi,
In the _strict_negative method of KnowledgeGraphCompletion, if 'A-->B' and 'B-->C' (A, B and C are entities, --> is a relation) are samples in the training set (i.e. self.fact_graph) while 'A-->C' is a sample in the validation set, then I think 'A-->C' will be regarded as a negative sample in the training stage. Is that a problem?