[ ] Fix averageResults.html layout. Currently, metrics from all tasks are aggregated together. That approach was valid when all tasks shared the same set of metrics (classification). Now that we have other tasks such as IR (information retrieval) and QA (question answering), it is no longer valid.
Proposal to Fix:
TABLE 1: Classification (averaged results from classification)
TABLE 2: IR (averaged results from information retrieval)
TABLE 3: QA (averaged results from question answering)
[ ] The other layout, which is included in the task pages, currently ignores both Question Answering and Information Retrieval tasks. This was intentional, because for these tasks there were blank spaces instead of values.
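A rough sketch of how the per-task-type tables proposed above could be computed, in Python. The input shape (`model`, `task`, `metrics`), the task names, the `TASK_TYPES` mapping, and every metric name other than f1-micro are assumptions made for illustration, not the project's actual data model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mapping from task name to task type (assumption).
TASK_TYPES = {
    "sentiment": "classification",
    "topics": "classification",
    "passage-search": "ir",
    "reading-comprehension": "qa",
}


def average_by_task_type(results, task_types=TASK_TYPES):
    """Average each model's metrics separately for every task type.

    `results` is a list of dicts with keys 'model', 'task' and 'metrics'.
    Returns {task_type: {model: {metric: averaged value}}}.
    """
    # task_type -> model -> metric -> list of values
    collected = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for row in results:
        task_type = task_types[row["task"]]
        for metric, value in row["metrics"].items():
            collected[task_type][row["model"]][metric].append(value)

    # One table per task type, so metrics of different task types never mix.
    return {
        task_type: {
            model: {metric: mean(values) for metric, values in metrics.items()}
            for model, metrics in models.items()
        }
        for task_type, models in collected.items()
    }
```

Each top-level key of the returned dict would then be rendered as its own table (TABLE 1, TABLE 2, TABLE 3), so a model with no IR or QA results simply does not appear in those tables instead of producing blank cells.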
Other issues:
[ ] Fix table sorting in results. Currently the default sorting metric for every table is f1-micro, but it should be set for each task type separately. Either remove the default sorting or propose a configuration for each task's default metric.
[ ] Add a self-reported column that indicates whether the full pipeline was used or not. This also involves adding that value to submissions.
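For the table sorting item above, one possible shape of a per-task-type default metric configuration, sketched in Python. Only f1-micro comes from this issue; the other metric names, the `DEFAULT_SORT_METRIC` mapping, and the `sort_rows` helper are hypothetical.

```python
# Hypothetical defaults: only "f1-micro" is taken from the issue,
# the IR and QA metric names are placeholders (assumption).
DEFAULT_SORT_METRIC = {
    "classification": "f1-micro",
    "ir": "ndcg",
    "qa": "exact-match",
}


def sort_rows(rows, task_type, defaults=DEFAULT_SORT_METRIC):
    """Sort result rows by the default metric of the given task type.

    Rows are dicts keyed by column name. If the task type has no
    configured default, the original order is kept (no default sorting).
    """
    metric = defaults.get(task_type)
    if metric is None:
        return list(rows)
    # Best score first; rows without the metric are placed last.
    return sorted(rows, key=lambda r: (metric not in r, -r.get(metric, 0.0)))
```

Leaving a task type out of the mapping corresponds to the "remove default sorting" option, so both alternatives from the item can be expressed with the same configuration.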
Fix averageResults.html layout
Proposal to Fix:
webpage/content/_index.md should include the line
{{< averageResults >}}
at the end.