Touché 2020 and 2021

janheinrichmerker commented 2 years ago

Finally had time to implement the Touché datasets :partying_face: I'd suggest a different naming scheme, as we now already have the corpora in ir_datasets that were used in Touché 2020 and 2021:

clueweb12
argsme/1.0
argsme/2020-04-01

I simply link to the existing datasets then.

Thus I'd think it's unnecessary to start the "paths" with a corpus version and instead start with the year component like this:

touche/2020
touche/2021

Then I'd like to specify the shared tasks like this:

touche/2020/task-1
touche/2020/task-2
touche/2021/task-1
touche/2021/task-2

Now some tasks have multiple qrels, so I'd then further split like this:

touche/2020/task-1 (New corrected qrels used in the official evaluation.)
touche/2020/task-1/argsme-1.0-uncorrected and touche/2020/task-1/argsme-2020-04-01-uncorrected (Old crowd-sourced qrels.)
touche/2021/task-1/relevance (general topic relevance)
touche/2021/task-1/quality (argument quality)
touche/2021/task-2/relevance (general topic relevance)
touche/2021/task-2/quality (argument quality)

All different versions are also explained in the YAML documentation and some have specific qrel definitions in the dataset definition.

(BTW this PR fixes #125 :wink:)

janheinrichmerker commented 2 years ago

I wonder if you could maybe publish a release with the newly added datasets and the many other changes you did since the last release?

seanmacavaney commented 2 years ago

Wow, you're right, it has been a bit.

Is it important that these changes are included in the release? I likely don't have the time to review these changes today, but I can pretty easily bump versions and release what's in the main branch now.

janheinrichmerker commented 2 years ago

I'd then rather wait and release Touché together with args.me :smiley:

janheinrichmerker commented 2 years ago

It's not urgent though.

seanmacavaney commented 2 years ago

Let me see if I can squeeze in a review of this today

seanmacavaney commented 2 years ago

Thanks again for the great work @heinrichreimer! After a quick scan, I propose the following changes:

The datasets are usually organised hierarchically with the corpus at the upper level. E.g., there are versions of TREC Web, TREC Health Misinformation, NTCIR WWW, and CLEF eHealth all under clueweb12. I think we should do the same thing here. I propose:

touche -> (remove)
touche/2020 -> (remove)
touche/2020/task-1 -> argsme/1.0/touche-2020-task1
touche/2020/task-1/argsme-1.0-uncorrected -> argsme/1.0/touche-2020-task1/uncorrected
touche/2020/task-1/argsme-2020-04-01-uncorrected -> argsme/2020-04-01/touche-2020-task1/uncorrected
touche/2020/task-2 -> clueweb12/touche-2020-task2
touche/2021 -> (remove)
touche/2021/task-1 -> argsme/2020-04-01/touche-2021-task1
touche/2021/task-1/quality -> (merge with above, see note below)
touche/2021/task-1/relevance -> (merge with above, see note below)
touche/2021/task-2 -> clueweb12/touche-2021-task2
touche/2021/task-2/quality -> (merge with above, see note below)
touche/2021/task-2/relevance -> (merge with above, see note below)
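For illustration, the renames proposed above can be written down as a plain lookup table. This is only a sketch of the mapping in this comment: `RENAMES` and `restructure_id` are made-up names and not part of ir_datasets. Removed entries map to `None`, and the quality/relevance variants map to the ID they would be merged into.

```python
from typing import Dict, Optional

# Hypothetical lookup table mirroring the list above (not an ir_datasets API).
# None means the ID is dropped; merged variants point at their parent ID.
RENAMES: Dict[str, Optional[str]] = {
    "touche": None,
    "touche/2020": None,
    "touche/2020/task-1": "argsme/1.0/touche-2020-task1",
    "touche/2020/task-1/argsme-1.0-uncorrected": "argsme/1.0/touche-2020-task1/uncorrected",
    "touche/2020/task-1/argsme-2020-04-01-uncorrected": "argsme/2020-04-01/touche-2020-task1/uncorrected",
    "touche/2020/task-2": "clueweb12/touche-2020-task2",
    "touche/2021": None,
    "touche/2021/task-1": "argsme/2020-04-01/touche-2021-task1",
    "touche/2021/task-1/quality": "argsme/2020-04-01/touche-2021-task1",
    "touche/2021/task-1/relevance": "argsme/2020-04-01/touche-2021-task1",
    "touche/2021/task-2": "clueweb12/touche-2021-task2",
    "touche/2021/task-2/quality": "clueweb12/touche-2021-task2",
    "touche/2021/task-2/relevance": "clueweb12/touche-2021-task2",
}


def restructure_id(old_id: str) -> Optional[str]:
    """Return the proposed corpus-first ID, or None if the old ID is removed.

    Raises KeyError for IDs that are not part of the proposal.
    """
    return RENAMES[old_id]
```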

It looks like we can merge both the quality and relevance assessments into the same qrel records, since it's the same query-doc pairs being judged. clueweb12/b13/trec-misinfo-2019 is an example of a dataset that does something similar.
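A minimal sketch of what merging could look like, assuming both qrel files judge exactly the same query-doc pairs. The record types `SingleQrel` and `MergedQrel`, their field names, and `merge_qrels` are hypothetical and only illustrate the idea; the actual dataset definition may well look different.

```python
from typing import Iterable, List, NamedTuple


class SingleQrel(NamedTuple):
    """One judgment from a single-dimension qrels file (hypothetical type)."""
    query_id: str
    doc_id: str
    score: int


class MergedQrel(NamedTuple):
    """One record carrying both judgment dimensions (hypothetical type)."""
    query_id: str
    doc_id: str
    relevance: int
    quality: int


def merge_qrels(
    relevance: Iterable[SingleQrel], quality: Iterable[SingleQrel]
) -> List[MergedQrel]:
    """Join relevance and quality judgments on (query_id, doc_id).

    Output follows the order of the relevance judgments. Raises ValueError
    if the two inputs do not cover exactly the same query-doc pairs.
    """
    relevance = list(relevance)
    quality_by_pair = {(q.query_id, q.doc_id): q.score for q in quality}
    relevance_pairs = {(r.query_id, r.doc_id) for r in relevance}
    if relevance_pairs != set(quality_by_pair):
        raise ValueError("relevance and quality qrels judge different pairs")
    return [
        MergedQrel(r.query_id, r.doc_id, r.score, quality_by_pair[(r.query_id, r.doc_id)])
        for r in relevance
    ]
```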

janheinrichmerker commented 2 years ago

Very good suggestions! I'll move the datasets and merge the quality and relevance qrels.

janheinrichmerker commented 2 years ago

One thing to keep in mind is that Touché 2022 (which I'm planning to add as well once qrels are available) uses corpora derived from args.me as well as custom corpora. For the extracted passage corpora derived from argsme-2020-04-01, I think we could then have custom documents under the argsme-2020-04-01/touche-2022-task-1 dataset ID, for example. (See task 1 description.) For 2022 task 2, would I then need to add the dataset with the ID webis/touche-2022-task-2? (See task 2 description.) What other "path structure" would be ideal here? (As adding Touché 2022 is not directly related to this PR, we might also discuss this in another issue.)

seanmacavaney commented 2 years ago

Thanks for the heads up!

For '22 Task 1, I'd lean towards:

argsme/2020-04-01/sents (sentence-segmented version of corpus)
argsme/2020-04-01/sents/touche-2022-task-1 (above + queries + qrels)
(or to keep the ID from getting too long, maybe argsme/2020-04-01-s prefix, where -s indicates sentence segmentation?)

For '22 Task 2, do we know about the corpus? It looks like they are passages derived from a subset of clueweb12. Do you think it'll be used beyond the '22 task? I think good options would be along the lines of:

clueweb12/[some-descriptive-subset-name]/touche-2022-task-2 if derived from clueweb12.
[some-descriptive-corpus-name]/touche-2022-task-2 if not from clueweb12 and the corpus is likely to be used in future years.
touche-2022-task-2 if not from clueweb12 and if this corpus is likely only going to be used this year (akin to trec-robust04)
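To spell out the three options as a decision rule, here is a small sketch. The function `task2_dataset_id` and its parameters are made up for illustration, and the bracketed defaults are the same placeholders as in the list above, not real dataset names.

```python
def task2_dataset_id(
    from_clueweb12: bool,
    reused_in_future: bool,
    subset_name: str = "[some-descriptive-subset-name]",
    corpus_name: str = "[some-descriptive-corpus-name]",
) -> str:
    """Pick one of the three ID shapes listed above (hypothetical helper)."""
    task = "touche-2022-task-2"
    if from_clueweb12:
        # Corpus is a subset of clueweb12: nest under the clueweb12 hierarchy.
        return f"clueweb12/{subset_name}/{task}"
    if reused_in_future:
        # Standalone corpus that later tasks may reuse: give it its own prefix.
        return f"{corpus_name}/{task}"
    # One-off corpus: a top-level ID is enough.
    return task
```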

But there's obviously no "right" answer for any of these -- I'm open to alternatives.

Some of these IDs are getting pretty long. I wonder if we should shorten touche-YYYY-task-X to toucheYY-taskX (e.g., touche22-task1) to keep them (slightly) more manageable?
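As a quick illustration of the shortened form, a regex rewrite along these lines would do it. `shorten_touche_id` is a hypothetical helper, not something in ir_datasets, and it accepts both the `task-1` and `task1` spellings used in this thread.

```python
import re

# Matches e.g. "touche-2022-task-1" or "touche-2020-task1" inside a longer ID.
# Group 2 is the two-digit year, group 3 the task number.
_LONG_FORM = re.compile(r"touche-(\d{2})(\d{2})-task-?(\d+)")


def shorten_touche_id(dataset_id: str) -> str:
    """Rewrite touche-YYYY-task-X to toucheYY-taskX, leaving the rest intact."""
    return _LONG_FORM.sub(r"touche\2-task\3", dataset_id)
```

IDs without a Touché component pass through unchanged, so the helper can be applied to any dataset ID.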

BTW, it can be helpful to add the docs and queries for '22 now, even before the qrels are released, to help folks participate in the task. Qrels can be added once they are released.

janheinrichmerker commented 2 years ago

There we go :smile:

seanmacavaney commented 2 years ago

Awesome work, as always. Thanks, @heinrichreimer!

janheinrichmerker commented 2 years ago

Thanks for the release!

allenai / ir_datasets

Touché 2020 and 2021 #135