allenai / ir_datasets

Provides a common interface to many IR ranking datasets.
https://ir-datasets.com/
Apache License 2.0

TREC 2023 Tip-of-the-Tongue #235

Open mam10eks opened 1 year ago

mam10eks commented 1 year ago

Dataset Information:

The training and dev data of the TREC 2023 Tip-of-the-Tongue track are now available: https://trec-tot.github.io/guidelines

Description from the website:

Tip of the tongue: The phenomenon of failing to retrieve something from memory, combined with partial recall and the feeling that retrieval is imminent
In terms of input and output, the movie identification task is relatively straightforward—given an input TOT request, output a ranked list of movies. Each movie must be identified by its Wikipedia page id and the correct movie should be ranked as high as possible. For each query, runs should return a ranked list of 1000 Wikipedia page ids. Runs will be evaluated using IR metrics that are appropriate for IR tasks with one relevant document, such as discounted cumulative gain, reciprocal rank, and success@k.
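To make the evaluation setup concrete, here is a minimal sketch of the three metrics named above for the case of exactly one relevant document per query. The function names and the toy page ids are made up for illustration; this is not the track's official evaluation code, and it assumes a binary gain of 1 for the correct movie.

```python
import math
from typing import Sequence


def reciprocal_rank(ranking: Sequence[str], relevant_id: str) -> float:
    """Return 1/rank of the single relevant id, or 0.0 if it was not retrieved."""
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0


def success_at_k(ranking: Sequence[str], relevant_id: str, k: int) -> float:
    """Return 1.0 if the relevant id appears in the top k results, else 0.0."""
    return 1.0 if relevant_id in ranking[:k] else 0.0


def dcg(ranking: Sequence[str], relevant_id: str) -> float:
    """Discounted cumulative gain with one relevant document of gain 1.

    With a single relevant document the ideal DCG is 1.0, so under this
    assumption the value is already normalized.
    """
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id == relevant_id:
            return 1.0 / math.log2(rank + 1)
    return 0.0


# Hypothetical run for one query: Wikipedia page ids, best guess first.
run = ["330", "12", "857"]
correct_page_id = "857"

rr = reciprocal_rank(run, correct_page_id)
s1 = success_at_k(run, correct_page_id, k=1)
s10 = success_at_k(run, correct_page_id, k=10)
gain = dcg(run, correct_page_id)
```

Per-query values like these would then be averaged over all queries. A real run would hold up to 1000 page ids per query rather than three.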

Dataset ID(s) & supported entities:

Checklist

Mark each task once completed. All should be checked prior to merging a new dataset.

Additional comments/concerns/ideas/etc.

mam10eks commented 1 year ago

I would like to implement this ticket.

mam10eks commented 1 year ago

cc @samarthbhargav

mam10eks commented 1 year ago

Dear all, I have now had time to implement this in the following branch: https://github.com/mam10eks/ir_datasets/tree/trec-tip-of-the-tongue

Almost everything is resolved, but I have forgotten how to do these two steps:

Otherwise, everything seems to be ready.

@seanmacavaney, I have forgotten: is there documentation on how to do those two steps?