allenai / ir_datasets

Provides a common interface to many IR ranking datasets.
https://ir-datasets.com/
Apache License 2.0
314 stars 42 forks

Duplicate download definitions #130

Closed: janheinrichmerker closed this issue 2 years ago

janheinrichmerker commented 2 years ago

Describe the bug Not really an urgent bug, as the package runs just fine, but the download definitions file contains duplicate keys.

Affected dataset(s)

To Reproduce Steps to reproduce the behavior:

  1. Open ir_datasets/etc/downloads.json
  2. Go to line 1628
  3. See duplicate keys:
    • es/docs
    • es/queries/dev
    • es/queries/train

Expected behavior Each dataset should be specified exactly once in ir_datasets/etc/downloads.json.
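
A quick way to check this automatically might look like the sketch below. Python's `json` module silently keeps the last value when a key is repeated, so the sketch uses `object_pairs_hook` to inspect the raw key/value pairs before they are collapsed into a dict. The helper name `find_duplicate_keys` and the sample JSON (including the `example-dataset` key and the URLs) are made up for illustration; only the duplicated key name `es/docs` comes from this report.

```python
import json
from collections import Counter


def find_duplicate_keys(text):
    """Return keys that occur more than once within the same JSON object."""
    duplicates = []

    def check_pairs(pairs):
        # pairs is the list of (key, value) tuples for one JSON object,
        # before json collapses repeated keys into a single entry.
        counts = Counter(key for key, _ in pairs)
        duplicates.extend(key for key, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(text, object_pairs_hook=check_pairs)
    return duplicates


# Hypothetical sample; the real layout of downloads.json may differ.
sample = """
{
  "example-dataset": {
    "es/docs": {"url": "https://example.com/a"},
    "es/queries/dev": {"url": "https://example.com/b"},
    "es/docs": {"url": "https://example.com/c"}
  }
}
"""

print(find_duplicate_keys(sample))  # ['es/docs']
```

To check the actual file, the same function could be given the contents of `ir_datasets/etc/downloads.json`; an empty list would mean no key is repeated within any single object.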


seanmacavaney commented 2 years ago

Thanks for reporting! I'll investigate.