allenai / ir_datasets

Provides a common interface to many IR ranking datasets.
https://ir-datasets.com/
Apache License 2.0

docs_iter - docstore progress doesn't use dataset size #85

Closed cmacdonald closed 3 years ago

cmacdonald commented 3 years ago

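Is your feature request related to a problem? Please describe.

While the docstore is being built, the docs_iter progress bar shows only a running count and rate, with no total: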

[INFO] [starting] building docstore
docs_iter: 1384223it [07:08, 3229.98it/s]

Describe the solution you'd like

The dataset size is known. Could it be passed to tqdm to make the progress bar more useful?

Additional context

msmarco-document-v2
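
For illustration, here is a minimal sketch of the requested behaviour, not the actual ir_datasets code. The helper name `docs_with_progress` and the toy document list are hypothetical; the only library call used is tqdm's `total=` argument, which lets the bar show a percentage and ETA instead of a bare counter.

```python
from tqdm import tqdm


def docs_with_progress(docs_iter, total=None, desc="docs_iter"):
    """Wrap a document iterator in a tqdm progress bar.

    Hypothetical helper for illustration only. When `total` is None,
    tqdm can only print a running count and rate (the output quoted
    in this issue). When `total` is given, it also shows a percentage
    and an estimated time remaining.
    """
    return tqdm(docs_iter, total=total, desc=desc, unit="doc")


# Toy stand-in for a real corpus (assumption: made-up documents).
toy_docs = [{"doc_id": str(i)} for i in range(1000)]

# Consume the wrapped iterator, as a docstore build would.
seen = sum(1 for _ in docs_with_progress(iter(toy_docs), total=len(toy_docs)))
```

With a real dataset, the total would have to come from wherever the library records the corpus size. Recent ir_datasets releases appear to expose this as `dataset.docs_count()`, but treat that as an assumption and check it against the installed version.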