yuchenlin closed this issue 6 years ago
We will soon report our benchmark results.
Thanks for the prompt reply! Could you please provide a typical, well-performing hyperparameter setting for the FB15K dataset? The settings in the example*.py files seem intended only for testing whether the framework runs successfully.
The settings in the OpenKE-PyTorch branch are what you want.
Thanks! I will give it a try soon and post my results here later!
The results of TransE on the FB15K data
overall results:
left 272.051422 0.461766 0.244469 0.081664
left(filter) 87.577713 0.726448 0.538606 0.251054
right 172.013092 0.538437 0.303804 0.104332
right(filter) 57.314960 0.779689 0.596756 0.268660
average raw MR = (272.051422 + 172.013092)/2 = 222.032
average filter MR = (87.577713 + 57.314960)/2 = 72.446
average raw Hits@10 = (0.461766 + 0.538437)/2 = 0.500
average filter Hits@10 = (0.726448 + 0.779689)/2 = 0.753
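The averaging above can be reproduced with a short script. This is only a sketch: the dictionary layout and the `average` helper are made up for illustration and are not part of the framework, and it assumes that the first two columns of each row are MR and Hits@10, as the hand calculation does.

```python
# Rows copied from the "overall results" output above.
# Assumption: column 0 is mean rank (MR) and column 1 is Hits@10,
# matching the hand-computed averages in this comment.
results = {
    "left": (272.051422, 0.461766),
    "left(filter)": (87.577713, 0.726448),
    "right": (172.013092, 0.538437),
    "right(filter)": (57.314960, 0.779689),
}


def average(left_key, right_key, column):
    """Mean of one column over the left and right rows (hypothetical helper)."""
    return (results[left_key][column] + results[right_key][column]) / 2


summary = {
    "raw MR": average("left", "right", 0),
    "filter MR": average("left(filter)", "right(filter)", 0),
    "raw Hits@10": average("left", "right", 1),
    "filter Hits@10": average("left(filter)", "right(filter)", 1),
}

for name, value in summary.items():
    print(f"average {name} = {value:.3f}")
```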
The results are quite amazing! They are much better than those reported in the original paper by Bordes et al. (2013), which are 243 (raw MR), 125 (filter MR), 34.9% (raw Hits@10), and 47.1% (filter Hits@10), respectively.
Is this expected? If so, what makes TransE in this framework work so much better than the original one? (I recall reading in a paper that this implementation includes some optimizations, but I forgot where.)
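To make the comparison concrete, here is a rough sketch that puts both sets of numbers on the same scale. It assumes the averaged values from this comment and the paper figures as quoted here. Note that the framework prints Hits@10 as a fraction while the paper reports it as a percentage, so the fractions are multiplied by 100 first. The variable names are purely illustrative.

```python
# Averages computed in this comment (Hits@10 converted from fraction to percent).
ours = {
    "raw MR": 222.032,
    "filter MR": 72.446,
    "raw Hits@10 (%)": 0.500 * 100,
    "filter Hits@10 (%)": 0.753 * 100,
}

# Figures for TransE on FB15K from Bordes et al. (2013), as quoted in this thread.
paper = {
    "raw MR": 243.0,
    "filter MR": 125.0,
    "raw Hits@10 (%)": 34.9,
    "filter Hits@10 (%)": 47.1,
}

# Lower is better for MR, higher is better for Hits@10.
diff = {name: ours[name] - paper[name] for name in ours}

for name, delta in diff.items():
    print(f"{name}: ours {ours[name]:.3f}, paper {paper[name]:.1f}, difference {delta:+.3f}")
```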
Bordes et al. (2013) did not release their code, so we cannot know why the results of our implementation differ so much from those reported in the original paper = =!
@THUCSTHanxu13 I see! Thanks so much! I am trying to tune the parameters to reproduce the reported performance of methods like ComplEx. Once I get some promising results, I will post the hyperparameters here.
Hi @yuchenlin,
I know this issue is closed, but I'm currently looking for the optimal parameters, too. Could you please share your parameters, especially for the WN18 dataset? The OpenKE-PyTorch (old) branch only includes them for the FB15K dataset.
Many thanks in advance!
Hi,
Thanks for this wonderful framework!
I was wondering whether you have reported official results (with hyperparameters) on WN18 or FB15K using this framework. I am trying to reproduce the results of TransE on these two benchmark datasets, but it seems hard to tune the model to results comparable to those reported in other papers.
Thus, I was thinking that you might have some benchmark results to verify that the framework could possibly reproduce similar results.