Closed scarlettllc closed 3 years ago
I've just found that Eclipse can also reproduce the results for QALD-9, but the results in Eclipse are not as robust. I'm still trying to find the reason.
If a query to Virtuoso results in a network exception, this is normal.
When I run the EDGQA system in IDEA, I can reproduce the results in the paper, and even get slightly better results, on both LC-QuAD and QALD-9. However, when I run the EDGQA system in Eclipse, I can't get such good results. I have run the QALD-9 test twice in Eclipse, and the results are as follows:
[INFO] Cumulative metrics, sample: 150, P: 0.277, R: 0.367, macro F1: 0.286, macro F1*: 0.316
[INFO] QALD Cumulative metrics, sample: 150, P: 0.557, R: 0.367, macro F1: 0.286, QALD macro F1: 0.442
and
[INFO] Cumulative metrics, sample: 150, P: 0.298, R: 0.387, macro F1: 0.306, macro F1*: 0.337
[INFO] QALD Cumulative metrics, sample: 150, P: 0.591, R: 0.387, macro F1: 0.306, QALD macro F1: 0.468
And the test results for QALD-9 in IDEA are as follows:
[INFO] Cumulative metrics, sample: 150, P: 0.319, R: 0.409, macro F1: 0.326, macro F1*: 0.359
[INFO] QALD Cumulative metrics, sample: 150, P: 0.546, R: 0.409, macro F1: 0.326, QALD macro F1: 0.468
I've applied the same settings in both Eclipse and IDEA. As far as I can tell, the results for QALD-9 ought to be independent of the IDE I use, so I wonder why the above difference between Eclipse and IDEA exists. Meanwhile, I've been running experiments on the LC-QuAD dataset in Eclipse to investigate further.
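One way to check that both IDEs really launch the program under the same conditions is to print the JVM settings at startup and compare the output from each IDE. The sketch below is only an illustration: the class name `EnvSnapshot` is hypothetical and not part of EDGQA, and the properties it reports (JDK version, default charset, working directory, maximum heap) are just common candidates for differences between IDE run configurations, not a confirmed cause of the gap above.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of EDGQA: collects JVM settings that
// an IDE run configuration can silently change.
public class EnvSnapshot {

    // Returns the settings in a fixed order so two outputs are easy to diff.
    static Map<String, String> snapshot() {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("java.version", System.getProperty("java.version"));
        env.put("java.vendor", System.getProperty("java.vendor"));
        // Default charset affects how text resources are read without an explicit encoding.
        env.put("default.charset", Charset.defaultCharset().name());
        // Working directory affects how relative paths are resolved.
        env.put("user.dir", System.getProperty("user.dir"));
        // Maximum heap the JVM will attempt to use, in megabytes.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        env.put("max.heap.mb", String.valueOf(maxHeapMb));
        return env;
    }

    public static void main(String[] args) {
        snapshot().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running this once from Eclipse and once from IDEA and diffing the two outputs would show whether the launch environments actually match. If they do, the remaining variation is more likely to come from outside the IDE.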