facebookresearch / ssl-relation-prediction

Simple yet SoTA Knowledge Graph Embeddings.

1vsAll objective and reciprocal triples #19

Closed loreloc closed 1 year ago

loreloc commented 1 year ago

Hi, I have noticed that in your experiments the flag --score_lhs is not enabled; this flag includes the component $-\log P_\theta(s\mid p,o)$ in the loss. In contrast, the 1vsAll objective does include this conditional likelihood, so there seems to be a discrepancy between the objective function in the paper (which conditions on the subjects) and the one used here.

Is it because you augment the data set with reciprocal triples? If so, is this equivalent to assuming that $P_\theta(S=s\mid R=p,O=o) = P_\theta(O=s\mid R=p^{-1},S=o)$, where $p^{-1}$ denotes the inverse relation?
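For concreteness, reciprocal-triple augmentation can be sketched as follows. This is a minimal NumPy sketch, not code from this repo; `add_reciprocal_triples` is a hypothetical helper, and encoding $p^{-1}$ as relation id `p + n_relations` is one common convention:

```python
import numpy as np

def add_reciprocal_triples(triples, n_relations):
    """For each triple (s, p, o), also add (o, p^{-1}, s), where the
    inverse relation p^{-1} is encoded as relation id p + n_relations."""
    s, p, o = triples[:, 0], triples[:, 1], triples[:, 2]
    reciprocal = np.stack([o, p + n_relations, s], axis=1)
    return np.concatenate([triples, reciprocal], axis=0)

# Example: two triples over 2 relations become four triples over 4 relation ids.
triples = np.array([[0, 0, 1],
                    [2, 1, 0]])
augmented = add_reciprocal_triples(triples, n_relations=2)
# augmented[2] is [1, 2, 0]  (inverse of the first triple)
# augmented[3] is [0, 3, 2]  (inverse of the second triple)
```

Training on the augmented set with only the object-prediction term then implicitly covers subject prediction through the inverse relations.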

Thank you

yihong-chen commented 1 year ago

@loreloc Hi Lorenzo, you are right. We did augment the data set with reciprocal triples, as empirically we found it works better than using --score_lhs. Let me know if you have further questions.

loreloc commented 1 year ago

Hi @yihong-chen, thank you for your answer.

So do you confirm that the loss stated in the paper (Eq. 2) is not exactly the one used in the experiments, but is actually the one shown in Lacroix et al. (2018) (Eq. 7) with the addition of the relation-prediction auxiliary?

yihong-chen commented 1 year ago

Hi @loreloc We have two implementations in our codebase, with and without reciprocal triples. The --score_lhs flag should be turned on if you are not using reciprocal triples. We also derive our objective (Eq. 2) in that setting, as it makes the underlying idea of "perturbing every position" clearer. This view of "perturbing every position" is very similar to masked language modelling in NLP, if you treat each position (subject/predicate/object) as one token and mask it.

Our reported results are with reciprocal triples. So you are right: it is Lacroix et al. (2018) (Eq. 7) plus the relation-prediction auxiliary. In general, using reciprocal triples is a very useful trick, as observed in both Dettmers et al. (2018) and Lacroix et al. (2018).
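The combination described above can be sketched like this. Again a NumPy/DistMult sketch under stated assumptions: inverse relation $p^{-1}$ is encoded as id `p + n_rel`, `lam` stands in for the relation-prediction weighting coefficient, and none of these names come from the repo:

```python
import numpy as np

def log_softmax(scores):
    """Numerically stable log-softmax over a 1-D score vector."""
    scores = scores - scores.max()
    return scores - np.log(np.exp(scores).sum())

def reciprocal_loss_with_rp(E, R, s, p, o, n_rel, lam=0.25):
    """Reciprocal-triple objective (Lacroix et al., 2018, Eq. 7) plus a
    relation-prediction auxiliary term weighted by lam.
    R holds 2 * n_rel rows: relation p^{-1} is row p + n_rel."""
    obj = -log_softmax((E[s] * R[p]) @ E.T)[o]          # -log P(o | s, p)
    sub = -log_softmax((E[o] * R[p + n_rel]) @ E.T)[s]  # -log P(s | o, p^{-1})
    # Auxiliary: predict the relation from the entity pair (here scored
    # over all 2 * n_rel relation ids, for simplicity of the sketch).
    rel = -log_softmax((E[s] * E[o]) @ R.T)[p]          # -log P(p | s, o)
    return obj + sub + lam * rel
```

Note that the subject-prediction term is realized through the inverse relation, which is why --score_lhs is left off in this configuration.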

Let me know if there is anything else I can help with.

loreloc commented 1 year ago

Thank you! I think this can be closed.