allenai / ir_datasets

Provides a common interface to many IR ranking datasets.
https://ir-datasets.com/
Apache License 2.0

Qrel definitions for multiple fields #136

Closed by janheinrichmerker 1 year ago

janheinrichmerker commented 2 years ago

Is your feature request related to a problem? Please describe.
When query-document pairs have multiple labels associated with them in their qrels (e.g., relevance and quality), only the relevance labels can be documented with qrels definitions (BaseQrels.qrels_defs()).

Describe the solution you'd like
I'd like to document qrels definitions for both relevance and quality, since their labels and descriptions differ.

Describe alternatives you've considered
I considered adding separate datasets, one for relevance qrels and one for quality qrels, but that is discouraged according to https://github.com/allenai/ir_datasets/pull/135#issuecomment-976658275.

Additional context
None.

seanmacavaney commented 2 years ago

What do you suppose is the best way to expose these alternate definitions? Maybe:

dataset.qrels_defs(field="relevance") # default to relevance field, but can provide an alternate field name here too

I've also thought about moving these definitions to the documentation YAML file (while still exposing them via .qrels_defs). Does that sound reasonable to you?
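For illustration, the optional-argument idea could look roughly like the sketch below. This is not ir_datasets code: the class `MultiFieldQrels`, the `_QRELS_DEFS` mapping, and all labels and descriptions are hypothetical, and the real `BaseQrels.qrels_defs()` takes no `field` argument as far as this thread states. The per-field mapping stands in for whatever would be loaded from the documentation YAML.

```python
from typing import Dict, Mapping

# Hypothetical label definitions for a dataset whose qrels carry two fields.
# Field names, labels, and descriptions are invented for illustration only.
_QRELS_DEFS: Mapping[str, Dict[int, str]] = {
    "relevance": {0: "not relevant", 1: "relevant", 2: "highly relevant"},
    "quality": {0: "low quality", 1: "acceptable quality", 2: "high quality"},
}


class MultiFieldQrels:
    """Stand-in for a qrels handler; NOT the ir_datasets BaseQrels class."""

    def __init__(self, defs: Mapping[str, Dict[int, str]]):
        self._defs = defs

    def qrels_defs(self, field: str = "relevance") -> Dict[int, str]:
        # Default to the relevance field so existing callers keep working.
        if field not in self._defs:
            raise KeyError(
                f"unknown qrels field {field!r}; available: {sorted(self._defs)}"
            )
        # Return a copy so callers cannot mutate the shared definitions.
        return dict(self._defs[field])


qrels = MultiFieldQrels(_QRELS_DEFS)
print(qrels.qrels_defs())                 # definitions for the default field
print(qrels.qrels_defs(field="quality"))  # definitions for an alternate field
```

Keeping `relevance` as the default means a call without arguments behaves as before, while datasets with extra label fields can document each one separately.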

janheinrichmerker commented 2 years ago

I agree it would be nice to have them in the YAML 👍 An optional argument in qrels_defs would be a nice addition too.

janheinrichmerker commented 1 year ago

Closing due to inactivity.