Closed hljjjmssyh closed 4 years ago
It's been a while, but if I recall correctly, all previous papers in this literature have reported right link prediction results, and we followed the same convention. If you want left-pred stats too, you can just uncomment the relevant lines in Test_NN.cpp and add some print statements. Shall I close this issue?
I tested the ConvE model. It does much better on left link prediction than on right link prediction. On FB15K-237, ConvE gives:
2019-01-24 12:13:32.572492 (INFO): Hits left @10: 0.6045496323529411
2019-01-24 12:13:32.573047 (INFO): Hits right @10: 0.36678538602941174
2019-01-24 12:13:32.574075 (INFO): Hits @10: 0.48566750919117646
2019-01-24 12:13:32.575755 (INFO): Mean rank left: 169.81772748161765
2019-01-24 12:13:32.577392 (INFO): Mean rank right: 381.87488511029414
2019-01-24 12:13:32.580399 (INFO): Mean rank: 275.84630629595586
So, I think that the result in your paper is insufficient. Thank you!
All reported results for ConvE and R-GCN were taken directly from their original papers, as mentioned in Table 4's caption. I'm really not sure what you mean by 'result is insufficient' exactly, could you clarify which result you're talking about?
Secondly, if you test TransE, TransH, TransR (the original authors' implementation), and mnick's implementation of HolE, you can observe that they reported Right Link Prediction (RLP) results in their papers, not the mean of RLP and LLP. Again, it's been a while, and I remember I found this odd, but you can confirm it for yourself. It seems to be a (perhaps arbitrary) convention.
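To make the RLP / LLP / mean distinction concrete, here is a minimal Python sketch of how the three variants of each metric relate. The function names and the toy ranks are made up for illustration; this is not the evaluation code from Test_NN.cpp or from any of the implementations mentioned above.

```python
# Hypothetical helpers: each list holds the rank of the correct entity
# for one test triple (1 = best), for one prediction direction.

def hits_at_k(ranks, k=10):
    """Fraction of test triples whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def mean_rank(ranks):
    """Average rank of the correct entity."""
    return sum(ranks) / len(ranks)

def summarize(left_ranks, right_ranks, k=10):
    """Report LLP, RLP, and their unweighted mean for both metrics."""
    hits_left, hits_right = hits_at_k(left_ranks, k), hits_at_k(right_ranks, k)
    mr_left, mr_right = mean_rank(left_ranks), mean_rank(right_ranks)
    return {
        "hits_left": hits_left,
        "hits_right": hits_right,
        "hits_mean": (hits_left + hits_right) / 2,
        "mr_left": mr_left,
        "mr_right": mr_right,
        "mr_mean": (mr_left + mr_right) / 2,
    }

# Toy ranks, invented purely for the example.
left_ranks = [1, 3, 12, 50]
right_ranks = [2, 15, 40, 200]
summary = summarize(left_ranks, right_ranks)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Since every test triple contributes one left and one right prediction, the unweighted mean of the two directions is the same as pooling all ranks together. A paper that reports only one direction and a paper that reports the mean are therefore not directly comparable.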
I used the code from this repo: https://github.com/TimDettmers/ConvE. The FB15K-237 results are in its README. I think the result they reported is the mean of RLP and LLP. If they had used only LLP, Hits@10 would be 0.6045496323529411, which is much higher than the average, yet they reported only the average in their paper. So I reckon you should use the average of LLP and RLP when comparing with other methods.
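As a quick sanity check, the combined figures in the ConvE log posted earlier in this thread are consistent with a plain average of the two directions. The snippet below only redoes that arithmetic on the numbers copied from the log; the variable and function names are my own.

```python
import math

# Values copied from the ConvE log earlier in this thread (FB15K-237).
hits10_left = 0.6045496323529411
hits10_right = 0.36678538602941174
hits10_all = 0.48566750919117646

mr_left = 169.81772748161765
mr_right = 381.87488511029414
mr_all = 275.84630629595586

def mean_of_directions(left, right):
    """Unweighted mean of the left and right link prediction scores."""
    return (left + right) / 2.0

# Compare with a tolerance, since the log values are printed floats.
hits_ok = math.isclose(mean_of_directions(hits10_left, hits10_right), hits10_all, rel_tol=1e-9)
rank_ok = math.isclose(mean_of_directions(mr_left, mr_right), mr_all, rel_tol=1e-9)

print("Hits@10 is the mean of left and right:", hits_ok)
print("Mean rank is the mean of left and right:", rank_ok)
```

Both checks pass for these numbers, which supports reading the combined "Hits @10" and "Mean rank" lines as the mean of LLP and RLP, at least for this run.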
I see what you mean, thanks for pointing it out. What are your results for the other models?
I noticed that you didn't use left link prediction in your code. So, I want to ask whether there is a problem with the data in your paper.