terrier-org / pyterrier

A Python framework for performing information retrieval experiments, building on http://terrier.org/
https://pyterrier.readthedocs.io/
Mozilla Public License 2.0

added corpus_iter for Terrier index #426

Closed: cmacdonald closed this 4 months ago

cmacdonald commented 5 months ago

Initial draft of https://github.com/terrier-org/pyterrier/issues/425

Unit tests are needed.

@seanmacavaney is the API useful?

cmacdonald commented 5 months ago

direct isn't the most intuitive name to me for this field.

Final decision: toks or return_toks?

seanmacavaney commented 5 months ago

I don't think I feel super strongly one way or the other, but I feel return_toks is a bit clearer (i.e., it sounds like a boolean).
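
To illustrate why a boolean-sounding name reads clearly at the call site, here is a minimal sketch of a corpus iterator with a return_toks flag. It is not PyTerrier's implementation: the function corpus_iter, the in-memory docs dictionary, and the whitespace tokenisation are all hypothetical stand-ins chosen for illustration, and the "docno" / "toks" keys are assumed.

```python
from collections import Counter
from typing import Dict, Iterator


def corpus_iter(docs: Dict[str, str], return_toks: bool = True) -> Iterator[dict]:
    """Hypothetical stand-in for iterating over the documents of an index.

    Yields one dict per document. When return_toks is True, each dict also
    carries a term -> frequency mapping under the (assumed) key "toks".
    """
    for docno, text in docs.items():
        row = {"docno": docno}
        if return_toks:
            # Naive whitespace tokenisation, for illustration only.
            row["toks"] = dict(Counter(text.lower().split()))
        yield row


# Toy "index": document number -> raw text (an assumption for this sketch).
docs = {"d1": "terrier index terrier", "d2": "python retrieval"}

rows_with_toks = list(corpus_iter(docs, return_toks=True))
rows_without_toks = list(corpus_iter(docs, return_toks=False))
```

With return_toks=False the iterator yields only document identifiers, which is the kind of on/off behaviour the boolean-style name signals.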

cmacdonald commented 5 months ago

Ok, ta. I'll write some unit tests, then I'll merge.

cmacdonald commented 4 months ago

My revised implementation requires Python 3.8. Python 3.7 is EOL (June 2023). Should we enforce an upgrade?

Colab is on Python 3.10 now.

seanmacavaney commented 4 months ago

I'm happy with min python version of 3.8
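
One way a minimum version like this could be enforced at import time is sketched below. This is an assumption about how such a check might look, not what the PR actually does; the names MIN_PYTHON and check_python_version are hypothetical.

```python
import sys

# Hypothetical constant: the minimum supported (major, minor) version.
MIN_PYTHON = (3, 8)


def check_python_version(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Raise RuntimeError if the given interpreter version is below `minimum`.

    `version_info` defaults to the running interpreter; passing a tuple
    explicitly makes the check easy to exercise in tests.
    """
    current = tuple(version_info[:2])
    if current < tuple(minimum):
        raise RuntimeError(
            "Python %d.%d or newer is required, found %d.%d"
            % (minimum[0], minimum[1], current[0], current[1])
        )
    return True
```

Calling check_python_version() at the top of a package's __init__.py would fail fast on older interpreters. At the packaging level, setuptools' python_requires=">=3.8" expresses the same constraint so that pip declines to install on unsupported versions.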