Closed ali-abz closed 3 years ago
Hi Ali, currently is_large_collection should always be False. When True, it creates a lean Anserini index (no raw text, term positions, etc.), but you will run into trouble when retrieving raw text from the index fails.
I see, thanks.
I wonder why collections like nf or antique do not set it to True, since is_large_collection, by definition, is set to False.
Sorry, I got that backwards (and edited above). Collections should not be considered large, so they should all set is_large_collection=False.
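For anyone landing here later, a minimal sketch of what this advice might look like for a new collection. The `Collection` base class, the `PersianWikipedia` class, and the `index_contents` helper are hypothetical stand-ins written for illustration, not the toolkit's actual API; only the `is_large_collection` attribute and its described effect come from this thread.

```python
class Collection:
    """Hypothetical stand-in for the toolkit's collection base class."""

    module_name = None
    # Per the advice in this thread, collections should leave this False for now.
    is_large_collection = False


class PersianWikipedia(Collection):
    """Hypothetical collection of about 1.4 million Persian Wikipedia paragraphs."""

    module_name = "persian_wikipedia"
    # Stated explicitly so the full (non-lean) index is built.
    is_large_collection = False


def index_contents(collection):
    """Illustrative only: what the index keeps, as described in this thread."""
    if collection.is_large_collection:
        # Lean index: no raw text or term positions, so raw-text lookups would fail.
        return {"raw_text": False, "term_positions": False}
    return {"raw_text": True, "term_positions": True}


print(index_contents(PersianWikipedia()))
```

The point of the sketch is only that the flag is a class-level setting on the collection, and that leaving it False keeps the raw text available for later retrieval.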
Thanks.
Hi there, I am trying to create a new collection of Persian Wikipedia. It contains about 1.4 million paragraphs, which I would like to index and use. I did not find any documentation regarding when to set is_large_collection to True or what it does. I would appreciate any comments.