Shared-task-on-NLG-Evaluation

Repository to organize a shared task on NLG evaluation

Description

We propose a shared task on the evaluation of data-to-text NLG systems, focusing on a common task, e.g. E2E or WebNLG. The shared task asks participants to evaluate system output and produce (1) scores for each output and (2) a ranking of the different systems. There will be a separate track for qualitative studies. At the event, we will discuss the results of the shared task, and afterwards we will collaborate with interested parties on a journal paper based on the shared task data.
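As an illustration only (the official submission format has not been defined), the sketch below shows how per-output scores from a participant could be aggregated into a system ranking. The CSV layout and the column names `system` and `score` are assumptions, not part of the shared task specification.

```python
# Hypothetical example: turn per-output evaluation scores into a system ranking.
# The file layout (columns "system" and "score") is an assumption.
import csv
from collections import defaultdict
from statistics import mean


def rank_systems(scores_path: str) -> list[tuple[str, float]]:
    """Return systems sorted from best to worst by their mean per-output score."""
    per_system = defaultdict(list)
    with open(scores_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            per_system[row["system"]].append(float(row["score"]))
    return sorted(
        ((system, mean(scores)) for system, scores in per_system.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


if __name__ == "__main__":
    for position, (system, avg) in enumerate(rank_systems("scores.csv"), start=1):
        print(f"{position}. {system}: {avg:.3f}")
```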

We encourage original contributions, for example:

The shared task data consists of three kinds of ‘output’ texts: system-generated outputs, human-written texts, and synthetic texts.

We will run the standard automatic evaluation metrics on the shared task data. These scores will be made public, along with the shared task data itself. The category labels (system, human, synthetic) will only be revealed at a later date, but before the workshop.
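As a rough sketch of what running a standard automatic metric on the shared task data could look like, the snippet below computes corpus-level BLEU with the sacrebleu library. The file names, the single-reference setup, and the choice of BLEU as the example metric are assumptions for illustration, not the final metric set.

```python
# Illustrative only: compute corpus-level BLEU over output texts with sacrebleu.
# File names and the single-reference setup are assumptions.
import sacrebleu


def corpus_bleu_from_files(hyp_path: str, ref_path: str) -> float:
    """Read one hypothesis and one reference per line and return the BLEU score."""
    with open(hyp_path, encoding="utf-8") as f:
        hypotheses = [line.strip() for line in f]
    with open(ref_path, encoding="utf-8") as f:
        references = [line.strip() for line in f]
    # sacrebleu expects a list of reference streams; here we use a single reference set.
    return sacrebleu.corpus_bleu(hypotheses, [references]).score


if __name__ == "__main__":
    print(f"BLEU = {corpus_bleu_from_files('outputs.txt', 'references.txt'):.2f}")
```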

Goals

Timeline

Organizing committee

We're still looking for volunteers. Use issue #1 to express your interest in joining the effort to make this event happen.