Closed: Wentao-Xu closed this issue 3 years ago
Hi @Wentao-Xu ,
We use the formulation Re(\overline{h}Rt^\top) throughout the paper for notational convenience. In fact, ComplEx can be implemented with either formulation. The key to ComplEx is its antisymmetric score function, not where the conjugation is placed. Since both the real and the imaginary parts of the embeddings are learnable parameters, implementations based on either formulation achieve the same performance.
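The equivalence claimed above can be checked numerically. The sketch below (hypothetical 2-dimensional embeddings, plain Python complex numbers; the function names are made up for illustration) shows that the two formulations give different scores on the same embeddings, but identical scores once the imaginary parts of the entity embeddings are re-parameterized (negated):

```python
def score_complex(h, r, t):
    """Original ComplEx: Re(<h, r, conj(t)>)."""
    return sum((hk * rk * tk.conjugate()).real for hk, rk, tk in zip(h, r, t))

def score_paper(h, r, t):
    """The paper's Equation 2 with diagonal R: Re(conj(h) R t^T)."""
    return sum((hk.conjugate() * rk * tk).real for hk, rk, tk in zip(h, r, t))

def negate_imag(v):
    """Re-parameterize an entity embedding: flip the sign of the imaginary part."""
    return [vk.conjugate() for vk in v]

h = [1 + 2j, -0.5 + 1j]
r = [0.3 - 0.7j, 2 + 1j]
t = [-1 + 0.5j, 0.25 - 2j]

# The two formulations differ on the same embeddings...
assert score_complex(h, r, t) != score_paper(h, r, t)
# ...but agree exactly after negating the imaginary parts of h and t.
assert score_complex(h, r, t) == score_paper(negate_imag(h), r, negate_imag(t))
```

Since the imaginary parts are learned from data, which of the two parameterizations the model uses is invisible to training, which is why the two formulations share the same performance.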
It is the definition of the dot product in complex spaces (see Wikipedia). We unify the formulations of tensor-factorization-based KGC models using a dot product between two complex vectors, while the authors of ComplEx use a component-wise multilinear dot product. The two formulations coincide when the relation matrices are diagonal and complex.
Thanks
Thanks for your response,
I have read your code. Your code uses the scoring function Re(<h, r, \overline{t}>), but Equation 2, Re(\overline{h}Rt^\top), does not correspond to your code, even though the two achieve the same performance: (Re(h) + Im(h) i)(Re(r) + Im(r) i)(Re(t) - Im(t) i) and (Re(h) - Im(h) i)(Re(r) + Im(r) i)(Re(t) + Im(t) i) are different. Perhaps Re(hR\overline{t}^\top) would be more accurate?
Thanks for pointing out that <u, v> represents the inner product of two complex vectors. But I still do not understand what h\overline{r} means in Re(<h\overline{r}, t>): do you mean h\overline{r} is the dot product between h and \overline{r}, or h\overline{r} = (Re(h) + Im(h) i)(Re(r) + Im(r) i)?
Looking forward to your response again.
Hi,
You can simply think of it as parameterizing the negative imaginary parts of the entity embeddings. Then the score function is the same as the one in our code.
$h\overline{R}$ is the multiplication of a complex vector by a complex matrix, and the result is a complex vector. When $R$ is diagonal, it is equivalent to the element-wise product between $h$ and $r$, where $r$ is the vector of diagonal elements of $R$.
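A quick sanity check of this equivalence (hypothetical values, numpy): multiplying a complex vector by a diagonal complex matrix gives the same result as an element-wise product with the vector of diagonal entries, which is why the code can store each relation as a vector while the paper writes it as a matrix.

```python
import numpy as np

h = np.array([1 + 2j, -0.5 + 1j])   # complex "entity" vector
r = np.array([0.3 - 0.7j, 2 + 1j])  # diagonal entries of the relation matrix R
R = np.diag(r)                      # the full diagonal relation matrix

# h R == h ∘ r when R is diagonal
assert np.allclose(h @ R, h * r)
```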
Thanks
Thanks for your detailed response; I think I understand what you mean. \overline{h} is [h_0, -h_1 i], R is [[r_0, 0],[0, r_1 i]], and t is [t_0, t_1 i]. Do you mean <h\overline{R}, t> = <(h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i, t_0 + t_1 i>?
To be honest, it is really hard to understand. Maybe there should be more introduction in the paper (e.g., more details in Section 2, Preliminaries)? Your code does not directly correspond to your paper (although they do the same thing). In the paper, the representation R_j of relation r_j is a matrix, but in your code the representation of r_j is a vector.
Yes, I think h\overline{R} is not an ordinary multiplication if h is a complex vector and \overline{R} is a diagonal matrix.
But I still do not understand why h\overline{R} = [h_0, h_1 i] [[r_0, 0],[0, -r_1 i]] = (h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i. Could you provide more details about the multiplication you defined?
OK, thanks for your reply. h is h_0 + h_1 i, and R is [[r_0 + r_1 i]], so the conjugate matrix \overline{R} is [[r_0 - r_1 i]]. Then h\overline{R} = [h_0 + h_1 i][[r_0 - r_1 i]] = h_0r_0 + h_1r_1 + (h_1r_0 - h_0r_1) i, but this is not the (h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i we want. Did I misunderstand something again?
Yes, h\overline{R} = [h_0 + h_1 i][[r_0 - r_1 i]] = h_0r_0 + h_1r_1 + (h_1r_0 - h_0r_1) i. It leads to an equivalent formulation of ComplEx, as we have discussed before.
In our paper, the dot product between two complex vectors u and v is <u, v> = \overline{u}v^\top (see Equation 2). Thus, when taking the dot product between h\overline{R} and t, h\overline{R} actually acts as h_0r_0 + h_1r_1 + (-h_1r_0 + h_0r_1) i.
I have mentioned that you can just think of it as parameterizing the negative imaginary parts (-h_1) of the entity embeddings. In this way, to implement h_0r_0 + h_1r_1 + (-h_1r_0 + h_0r_1) i, the code computes (h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i.
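This step can be verified with hypothetical scalar values. The check below confirms that the dot product <u, v> = \overline{u}v^\top is what numpy's `vdot` computes (it conjugates its first argument), and that substituting h_1 -> -h_1 into the paper's expression for h\overline{R} recovers the plain complex product used in the code:

```python
import numpy as np

# <u, v> = conj(u) v^T is exactly numpy's vdot.
u = np.array([1 + 2j, -0.5 + 1j])
v = np.array([-1 + 0.5j, 0.25 - 2j])
assert np.isclose(np.vdot(u, v), np.sum(np.conj(u) * v))

# The paper's form: conj(h) r = (h0 r0 + h1 r1) + (h0 r1 - h1 r0) i
h0, h1 = 1.0, 2.0
r0, r1 = 0.3, -0.7
paper = complex(h0 * r0 + h1 * r1, h0 * r1 - h1 * r0)
assert paper == complex(h0, -h1) * complex(r0, r1)  # i.e. conj(h) * r

# Substitute h1 -> -h1 (the code parameterizes the negated imaginary part):
code = complex(h0 * r0 - h1 * r1, h0 * r1 + h1 * r0)
assert code == complex(h0, h1) * complex(r0, r1)    # plain h * r in the code
```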
OK, now I totally understand. The notation h_1 in your paper does not correspond to the vector h_1 in your code, but to the vector (-h_1) in your code. Another question: if you parameterize the negative imaginary part (-h_1) of the entity embeddings, then since the tail entity t shares the same embedding as the head entity h, do you also parameterize the negative imaginary part (-t_1)?
That is, given a 4000-dimensional complex vector [e_0, e_1] for the embedding of h or t (h and t being the same entity, e.g., lion, just in different positions), the real parameterization of the head entity h is [e_0, -e_1], and the tail entity's embedding t should also be [e_0, -e_1], since h and t are the same entity.
Yes. That is why there is no conjugation on t, unlike in the original ComplEx paper.
But why do you do this transformation? Why not make the notation in the paper correspond to the code? This transformation makes the paper harder to understand, and I could not have understood it without such a detailed explanation. Haha, you really had me going in circles: the paper never says that the imaginary parts of the parameters carry a negative sign, so I spent a long time working on paper without being able to derive the formula in your paper.
I have also mentioned that we use the formulation Re(\overline{h}Rt^\top) for notational convenience throughout the paper :). Moreover, the notations in our paper are self-consistent and equivalent to the implementation in our code.
Haha, if this implementation detail were written in the paper, even more people would probably find it confusing.
All right, but my first impression of this paper was to wonder why Equation 2 differs from the scoring function of ComplEx in ICML 2016 or ICML 2018, and the reason turns out to be that you parameterize the negative imaginary parts (-h_1) and (-t_1) of the entity embeddings.
In a word, I think more clarifications are definitely necessary.
Hi, thanks for sharing the code. I have some questions about Equation 2 in this paper.
Looking forward to your response.