Closed zozo170610 closed 1 year ago
Hello,
- Is the scoring matmul process right? In the related article, the circular convolution between subject and object is calculated first, then multiplied (matmul) with the relation vector, but your code first calculates the circular convolution between the subject and the relation vector, then applies matmul with the object vector.
According to the original article, the score of a triple $f(s,p,o) = \sigma(\boldsymbol{r}_p^T(\boldsymbol{e}_s \star \boldsymbol{e}_o))$.
So $f(h,r,t) = \sigma\left(\Sigma_{k=0}^{d-1} \boldsymbol{r}_k \times (\boldsymbol{h}\star \boldsymbol{t})_k\right)$ and as
$$(\boldsymbol{h}\star\boldsymbol{t})_k = \Sigma_{i=0}^{d-1} \boldsymbol{h}_i \boldsymbol{t}_{i+k \text{ mod }d}$$
then
$$f(h,r,t) = \sigma \left( \Sigma_{k=0}^{d-1} \Sigma_{i=0}^{d-1} \boldsymbol{h}_i \boldsymbol{r}_k \boldsymbol{t}_{i+k \text{ mod }d} \right)$$
Now the code available is
hr = matmul(h.view(-1, 1, self.emb_dim), r)
(hr.view(-1, self.emb_dim) * t).sum(dim=1)
Removing the first dimension (corresponding to the batch), this comes down to computing for each sample $\boldsymbol{h}^T M_r \boldsymbol{t}$, where $M_r$ is the matrix such that $M_r[i,j] = \boldsymbol{r}_{j-i \text{ mod }d}$ (there is indeed a mistake in the docstring of the static method get_rolling_matrix). As a result
$$\boldsymbol{h}^T M_r \boldsymbol{t} = \Sigma_{i=0}^{d-1} \boldsymbol{h}_i \times (M_r\boldsymbol{t})_i = \Sigma_{i=0}^{d-1} \boldsymbol{h}_i \times \left(\Sigma_{k=0}^{d-1} M_r[i,k] \boldsymbol{t}_k\right) = \Sigma_{i=0}^{d-1} \Sigma_{k=0}^{d-1} \boldsymbol{h}_i \boldsymbol{r}_{k - i \text{ mod }d} \boldsymbol{t}_k$$
With the change of variable $u = k - i \text{ mod }d$ (so that $k = u + i \text{ mod }d$), this leads to
$$\boldsymbol{h}^T M_r \boldsymbol{t} = \Sigma_{i=0}^{d-1} \Sigma_{u=0}^{d-1} \boldsymbol{h}_i \boldsymbol{r}_{u} \boldsymbol{t}_{u + i \text{ mod }d}$$
This is the same expression as the one in the article. Does that answer your question?
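The equivalence can also be checked numerically. The following is a minimal sketch in plain Python, not the library's code: the helper names (`circular_correlation`, `rolling_matrix`, `score_article`, `score_code`) are hypothetical, the rolling matrix is assumed to follow $M_r[i,j] = \boldsymbol{r}_{j-i \text{ mod }d}$ as described above, and the sigmoid is left out since it is applied identically on both sides.

```python
import random


def circular_correlation(h, t):
    # (h * t)_k = sum_i h_i * t_{(i + k) mod d}
    d = len(h)
    return [sum(h[i] * t[(i + k) % d] for i in range(d)) for k in range(d)]


def rolling_matrix(r):
    # Assumed layout: M_r[i][j] = r_{(j - i) mod d}
    d = len(r)
    return [[r[(j - i) % d] for j in range(d)] for i in range(d)]


def score_article(h, r, t):
    # Article ordering: correlate h with t, then take the dot product with r.
    c = circular_correlation(h, t)
    return sum(rk * ck for rk, ck in zip(r, c))


def score_code(h, r, t):
    # Code ordering: (h^T M_r) first, then the dot product with t.
    d = len(h)
    m = rolling_matrix(r)
    hr = [sum(h[i] * m[i][j] for i in range(d)) for j in range(d)]
    return sum(hr[j] * t[j] for j in range(d))


random.seed(0)
d = 5
h = [random.uniform(-1, 1) for _ in range(d)]
r = [random.uniform(-1, 1) for _ in range(d)]
t = [random.uniform(-1, 1) for _ in range(d)]

# Both orderings give the same pre-sigmoid score, up to rounding.
assert abs(score_article(h, r, t) - score_code(h, r, t)) < 1e-12
```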
- After mod shifting the relation matrix, e.g. relation matrix shape (2, 3) → (2, 3, 3), when your code applies matmul with the head embedding vector, the multiplication is done by column. Shouldn't it be done by row instead?
I believe the confusion comes from the mixup in the docstring of get_rolling_matrix. Right?
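To make the row/column point concrete, here is a small plain-Python sketch. The helpers `rolling_matrix`, `vec_mat`, `mat_vec` and `dot` are hypothetical names, and the matrix layout $M_r[i,j] = \boldsymbol{r}_{j-i \text{ mod }d}$ is assumed. A row vector times a matrix combines the head vector with each column of the matrix, which is what $\boldsymbol{h}^T M_r$ requires. Multiplying by rows instead amounts to using $M_r^T$, and that gives the score of the triple with head and tail swapped.

```python
def rolling_matrix(r):
    # Assumed layout: M_r[i][j] = r_{(j - i) mod d}
    d = len(r)
    return [[r[(j - i) % d] for j in range(d)] for i in range(d)]


def vec_mat(v, m):
    # Row vector times matrix: entry j is the dot product of v with column j.
    d = len(v)
    return [sum(v[i] * m[i][j] for i in range(d)) for j in range(d)]


def mat_vec(m, v):
    # Matrix times column vector: entry i is the dot product of row i with v.
    d = len(v)
    return [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


r = [1, 2, 3]
h = [1, 10, 100]
t = [4, 5, 6]
m = rolling_matrix(r)

by_column = dot(vec_mat(h, m), t)  # h^T M_r t, the ordering used in the code
by_row = dot(mat_vec(m, h), t)     # h^T M_r^T t, the "by row" alternative
swapped = dot(vec_mat(t, m), h)    # t^T M_r h, head and tail exchanged

# The row-wise variant equals the score with head and tail swapped.
assert by_row == swapped
```

Since HolE scores are generally not symmetric in head and tail, the two variants differ in general, so the column-wise product is the one that matches the derivation above.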
I assume this is no longer an issue.
I'm going to use the HolE model from your code.
Before I do, I have a few questions, especially about scoring_function.
1) Is the scoring matmul process right? In the related article, the circular convolution between subject and object is calculated first, then multiplied (matmul) with the relation vector, but your code first calculates the circular convolution between the subject and the relation vector, then applies matmul with the object vector.
2) After mod shifting the relation matrix, e.g. relation matrix shape (2, 3) → (2, 3, 3), when your code applies matmul with the head embedding vector, the multiplication is done by column. Shouldn't it be done by row instead?
e.g. if the matrix after mod shifting is [[1,1,1], [2,2,2], [3,3,3]],
then when matmul is applied with the head vector, it is calculated by column [1, 2, 3], not by row [1,1,1] ...