allenai / ir_datasets

Provides a common interface to many IR ranking datasets.
https://ir-datasets.com/
Apache License 2.0
316 stars 42 forks

`ir_datasets` writes to `$HOME` which makes reproducibility hard #237

Open MangoIV opened 1 year ago

MangoIV commented 1 year ago

Describe the bug

Importing `ir_datasets` requires writing to `$HOME`. Perhaps this is common practice in Python land, but I have been slightly alienated by it. I expected the package to install all of its files at installation time rather than at runtime, because writing them at runtime makes the behavior hard to reproduce. Working around this is always hard: when building the package with Nix, I end up patching the actual code so that it doesn't write to places that are not supposed to be written to.
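A possible stopgap that avoids patching the source is to redirect the write location before the first import. The sketch below assumes that `ir_datasets` resolves its data directory from the `IR_DATASETS_HOME` environment variable and only falls back to a directory under `$HOME` when that variable is unset. The helper name `redirect_ir_datasets_home` is hypothetical and not part of the library.

```python
import os
import tempfile
from pathlib import Path


def redirect_ir_datasets_home(target=None):
    """Point ir_datasets at an explicit, writable directory instead of $HOME.

    Assumption: ir_datasets reads IR_DATASETS_HOME from the environment
    when it resolves its data directory. This helper is illustrative only.
    """
    if target is not None:
        base = Path(target)
        base.mkdir(parents=True, exist_ok=True)
    else:
        # No target given: use a fresh temporary directory.
        base = Path(tempfile.mkdtemp(prefix="ir_datasets_"))
    os.environ["IR_DATASETS_HOME"] = str(base)
    return base


# Call this before the first `import ir_datasets`, since the write reported
# in this issue happens at import time.
home = redirect_ir_datasets_home()
# import ir_datasets  # uncomment where the package is installed
```

In a sandboxed build, where `$HOME` may not be writable at all, the same variable could be exported by the build script instead. This only moves the write elsewhere; it does not make the import side-effect free.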

Expected behavior

It would be nice if `ir_datasets` did not silently write to `$HOME` and instead installed all of its files during installation.

Additional context

Please excuse my initially quite harsh text; I wasn't intending to sound aggressive.