Closed · vardaan123 closed this issue 2 years ago
Could you check those `filtered_with_test` metrics? I noticed that there is a big gap between your dev and test results, which could be due to this.
Thanks for pointing it out. Using the `filtered_with_test` metrics, I get:

Dev set: MR = 156.61, MRR = 0.375, Hits@1 = 0.2815, Hits@10 = 0.5628, which is better than reported.
BTW, do I understand it correctly that the `filtered_with_test` metric considers the test set for filtering in addition to the train and dev sets? Also, is there any equivalent of the `filtered_with_test` metric for test-set evaluation?
Yes, the reported results are averages.
Yes, evaluation on each split filters out the examples in train/dev and in the split itself by default, so evaluating on the test set is also `filtered_with_test`, just not reported under that name.
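For anyone else reading this, here is a minimal sketch of what "filtered" ranking means in link-prediction evaluation (standard setup, not this repo's actual code): when ranking the correct entity for a query, every *other* entity known to be a true answer in train/dev/test is excluded from the candidates so it cannot push the correct one down.

```python
# Sketch of filtered link-prediction ranking (assumed, illustrative setup).
def filtered_rank(scores, target, known_true):
    """1-based rank of `target`, ignoring other known-true entities.

    scores     : dict entity -> model score (higher is better)
    target     : the correct entity for this query
    known_true : other entities that also complete this query in train/dev/test
    """
    target_score = scores[target]
    # Count candidates scoring above the target, skipping filtered entities.
    better = sum(
        1
        for e, s in scores.items()
        if e != target and e not in known_true and s > target_score
    )
    return better + 1


# Toy example: "b" outscores the target but is another true answer,
# so it is filtered out and the target ranks 2nd instead of 3rd.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
print(filtered_rank(scores, target="c", known_true={"b"}))  # -> 2
```

Whether the test split is included in `known_true` is exactly the difference between the plain filtered setting and `filtered_with_test`.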
You can check `eval.filter_splits` and its explanation here.
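If I follow the explanation above, the config might look something like this (the surrounding YAML structure is my assumption; only the `eval.filter_splits` key is taken from this thread):

```yaml
# Hypothetical sketch: listing all three splits under eval.filter_splits
# would make dev evaluation match the filtered_with_test setting.
eval:
  filter_splits: [train, dev, test]
```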
Cool, thanks for the clarification!
Thanks for releasing the code for your work. I am trying to reproduce the numbers for the no-context version of the model. Using the config file `trmeh-fb15k237-noctx.yaml`, I am getting the following metrics:

Dev set: MR = 167.69, MRR = 0.327, Hits@1 = 0.2308, Hits@10 = 0.520
Test set: MR = 170.79, MRR = 0.369, Hits@1 = 0.2749, Hits@10 = 0.5590
However, in Table 3, the reported numbers are MRR = 0.373, Hits@10 = 0.561 (dev set).
Kindly explain how to reproduce the results. Thanks!
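For context on what is being compared: MR, MRR, and Hits@k are all summary statistics over the per-query (filtered) ranks. A quick sketch using the standard definitions (not this repo's code):

```python
# Standard rank-based summary metrics for link prediction (illustrative).
def summarize(ranks):
    """Compute MR, MRR, and Hits@k from a list of 1-based ranks."""
    n = len(ranks)
    return {
        "MR": sum(ranks) / n,                    # mean rank (lower is better)
        "MRR": sum(1.0 / r for r in ranks) / n,  # mean reciprocal rank
        "Hits@1": sum(r <= 1 for r in ranks) / n,
        "Hits@10": sum(r <= 10 for r in ranks) / n,
    }


# Toy example with four queries.
m = summarize([1, 3, 12, 2])
print(m)
```

Since MRR and Hits@k only reward top-of-list ranks while MR is dominated by the tail, changing the filtering setting (e.g. plain filtered vs `filtered_with_test`) can shift them by different amounts, which is consistent with the gap discussed above.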