Closed odashi closed 1 year ago
@neubig @pfliu-nlp The change almost works, but several tests fail due to value mismatches:
======================================================================
FAIL: test_extractive_qa_en (integration_tests.extractive_qa_test.ExtractiveQATest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "~/ExplainaBoard/integration_tests/extractive_qa_test.py", line 48, in test_extractive_qa_en
self.assertAlmostEqual(overall["ExactMatch"].value, 0.6974789915966386, 2)
AssertionError: 0.6571428571428571 != 0.6974789915966386 within 2 places (0.04033613445378148 difference)
======================================================================
FAIL: test_extractive_qa_zh (integration_tests.extractive_qa_test.ExtractiveQATest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "~/ExplainaBoard/integration_tests/extractive_qa_test.py", line 79, in test_extractive_qa_zh
self.assertAlmostEqual(overall["F1"].value, 0.7559651817716333, 2)
AssertionError: 0.6857142857142857 != 0.7559651817716333 within 2 places (0.07025089605734758 difference)
======================================================================
FAIL: test_qa_metrics (integration_tests.metric_test.MetricTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "~/ExplainaBoard/integration_tests/metric_test.py", line 147, in test_qa_metrics
self.assertAlmostEqual(overall["ExactMatch"].value, 0.6974789915966386, 2)
AssertionError: 0.6571428571428571 != 0.6974789915966386 within 2 places (0.04033613445378148 difference)
I don't understand the source of these errors. It would be nice if you could take a look at them.
I found some issues; fixing them.
This change attempts to change all list[Metric***] structures into dict[str, Metric***], whose key is the metric name.
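To illustrate the kind of conversion described above, here is a minimal sketch. It assumes each metric object exposes a `name` attribute; `MetricResult` and `to_metric_dict` are hypothetical names used only for illustration, not ExplainaBoard's actual classes or functions, and the values are made up.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MetricResult:
    """Hypothetical stand-in for a Metric*** object (not the real class)."""

    name: str
    value: float


def to_metric_dict(metrics: list[MetricResult]) -> dict[str, MetricResult]:
    """Convert a list of metrics into a dict keyed by metric name.

    Raises ValueError on duplicate names, since a dict would otherwise
    silently keep only the last entry.
    """
    result: dict[str, MetricResult] = {}
    for metric in metrics:
        if metric.name in result:
            raise ValueError(f"duplicate metric name: {metric.name}")
        result[metric.name] = metric
    return result


# Illustrative values only; these are not results from the test suite.
overall_list = [MetricResult("ExactMatch", 0.5), MetricResult("F1", 0.75)]
overall = to_metric_dict(overall_list)

# Lookup by name replaces positional indexing into the list.
f1_value = overall["F1"].value
```

With a dict, call sites such as `overall["ExactMatch"].value` no longer depend on the order in which metrics were configured, which is presumably why the tests above index by name.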