Loss Function of Trans_L1_Model

navdeepkjohal commented 5 years ago

Hello,

I might be wrong, but is the self.loss function of the Trans_L1_Model correctly implemented in line 453 of model.py especially the reshape() part.

I am doubting it because lets say i have batch_size = 2, neg_ratio = 4 and the portion of the code just before reshape will give matrix as in figure (a) in the attached figure Picture1.pdf where c1, c3, c5, c7 represent the corruption of same true triple and c2, c4, c6, c8 represent corruption of another triple.

When we reshape it by calling reshape((int(batch_size),int(neg_ratio)) as in the current code, we get something as in figure (b). If this is followed by sum along dimension 1, then the two different type of triples's data is being added to each other as in figure (c) (first column has sum = c1+c2+c3+c4 where c1, c3 belong to one type of triple is being added to c2, c4 which is another type).

On the other hand if we implement it as reshape(neg_ratio, batch_size) followed by sum along dimension 0, then i adds only one type of triple's data to each other.

please correct me if my understanding of this code is wrong.

Thanks in advance Navdeep

ttrouill commented 5 years ago

Hi Navdeep!

Thank you for digging into this, indeed this not correct, my bad, I just pushed a fix. Though I just realized that using neg_ratio>1 on TransE makes things worse than neg_ratio=1, even after the fix, so maybe the right way to do it is actually not to average inside the loss, which, now that I think about it is indeed less constraining than creating neg_ratio pairs by replicating the same positive with each generated negative, which actually amounts to simply making more iterations with neg_ratio=1 ;)

Can't remember why I did that that way though, sorry for that '^^

ttrouill commented 5 years ago

In the end I just pushed another fix that removes this averaging of the negative scores inside the loss and simply does max_iter *= neg_ratio when neg_ratio > 1 for TransE, that's the best way to do it and that's what actually works. Thanks again for bringing this up!

ttrouill / complex

Loss Function of Trans_L1_Model #14