Reopening evaluation issue

dschaehi commented 5 years ago

Hi,

I was just going through your code and found out that the training data has been augmented by adding new relations for reversed triples from the training set (correct me if I am wrong). I am not sure whether this is harmless, as this might have a regularzing effect on the weights the model learns.

Instead of adding new relations for reversing the triples, could you try the following and check whether this gives the same result?

Create d.train_data_reversed, where for each triple from d.train_data you only switch e_s and e_o and keep the relation. (So you don't create any new relations in this dataset.)
Add to class TuckER a method forward_reversed that is exactly the same as forward, but transposes the tensor W, so that the axes for e_s and e_o are switched.
When training, use forward for d.train_data and use forward_reversed for d.train_data_reversed

I think this way, one can guarantee that the evaluation is fair. It would be also interesting to know how you evaluate other models you compare with, for examples, whether you use the BCE loss and augment the training data for other models as well. This will make sure that it is is not the BCE loss or data augmentation that helps TuckER perform well.

ibalazevic commented 5 years ago

Check my answer to issue #3. Also, Figure 3 in the paper is produced by using inverse relations for both ComplEx and SimplE as well.

dschaehi commented 5 years ago

Thanks for the answer. Did you use the BCE loss for ComplEx and SimplE as well?

ibalazevic commented 5 years ago

Yes, they are trained exactly the same way.

dschaehi commented 5 years ago

OK. Cool. I think it is still worth to know what difference adding inverse relations make. If adding inverse relations increseases the performance of the model, then one should mention this in the paper.

ibalazevic commented 5 years ago

Yes, you're right. But given that we are not the ones who introduced adding inverse relations (ConvE and CP do this before us), we didn't include the corresponding ablation study in the paper. We might add it in the next iteration of the paper.

dschaehi commented 5 years ago

One more question: does ConvE introduce inverse relations in its implementation? The paper introduces something called "inverse model" but that is not related to adding inverse relations to the training set. I haven't checked the implementation, but if the implementation adds inverse relations to the training set without mentioning this in the paper, then this is a bit cheating. I just checked the CP paper again and found the following passage:

Learning and predicting with the inverse predicates, how- ever, changes the picture entirely. First, with both CP and ComplEx, we obtain significant gains in performance on all the datasets.

Given this observation by the authors of CP, it seems that it is a must to mention whether inverse relations are added to the training set or not, and do an ablation study.

ibalazevic commented 5 years ago

Yes, ConvE does introduce inverse relations but they don't mention it in the paper. Check their code (linked in the issue #3, you might need to dig a bit and go into their spodernet repo to find it, I can't remember exactly now).

Thanks for pointing that out. In fairness, the CP paper should have done that, but as I already said, we will most probably include it in the next iteration.

dschaehi commented 5 years ago

Yes, ConvE does introduce inverse relations but they don't mention it in the paper. Check their code (linked in the issue #3, you might need to dig a bit and go into their spodernet repo to find it, I can't remember exactly now).

Wow. Really? This is disappointing...

Thanks for pointing that out. In fairness, the CP paper should have done that, but as I already said, we will most probably include it in the next iteration.

Right. The CP paper should have done an ablation study. But at least they say explicitly in Section 5 that they introduce a new CP objective which includes inverse relations. So it would be nice, if the TuckER paper mentions this as well. Anyway, thanks for the clarification.

ibalazevic / TuckER

Reopening evaluation issue #7