castorini / pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
http://pyserini.io/
Apache License 2.0

no attribute 'TREC2023_DL' #1822

Closed kisozinov closed 6 months ago

kisozinov commented 7 months ago

Hi, I've installed the jar https://repo1.maven.org/maven2/io/anserini/anserini/0.24.2/anserini-0.24.2-fatjar.jar and moved it into pyserini.

After successfully indexing my custom corpus, I tried to run a search (following the guide at https://github.com/castorini/pyserini/blob/master/docs/usage-index.md#building-a-bm25-index-embeddable-python-implementation):

python -m pyserini.search.lucene \
  --index .../bm25/indices \
  --topics .../queries_7b.tsv \
  --output run.bm25_7b.txt \
  --bm25

I got this error: AttributeError: type object 'io.anserini.search.topicreader.Topics' has no attribute 'TREC2023_DL'. Did you mean: 'TREC2020_DL'?

A simple launch like this works fine:

from pyserini.search.lucene import LuceneSearcher

searcher = LuceneSearcher('indexes/sample_collection_jsonl')
hits = searcher.search('document')

for i in range(len(hits)):
    print(f'{i+1:2} {hits[i].docid:4} {hits[i].score:.5f}')

justram commented 6 months ago

Should be fine now

lintool commented 6 months ago

Hi @kisozinov - I just pushed out an update release. Issue should be fixed. Closing issue, but reopen if needed.
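
For anyone who hits a similar error, it can help to check which Pyserini release and which Anserini fatjar are actually in use before and after upgrading. The thread does not state the root cause, so the sketch below only assumes that a version mismatch between the Python package and the jar is a plausible suspect. It uses the standard library only; the helper names are hypothetical, and the jar directory is passed in by the caller because its location inside a given install is not confirmed here.

```python
import re
import sys
from importlib import metadata
from pathlib import Path

# Matches file names such as "anserini-0.24.2-fatjar.jar".
# The optional "-SNAPSHOT" part is an assumption about development builds.
JAR_PATTERN = re.compile(
    r"anserini-(\d+)\.(\d+)\.(\d+)(?:-SNAPSHOT)?-fatjar\.jar$"
)


def parse_anserini_version(jar_name):
    """Return (major, minor, patch) parsed from a fatjar file name, or None."""
    match = JAR_PATTERN.search(jar_name)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_pyserini_version():
    """Return the installed pyserini version string, or None if not installed."""
    try:
        return metadata.version("pyserini")
    except metadata.PackageNotFoundError:
        return None


def find_anserini_jars(jar_dir):
    """List (file name, version tuple) for every Anserini fatjar in jar_dir."""
    directory = Path(jar_dir)
    if not directory.is_dir():
        return []
    found = []
    for path in sorted(directory.glob("anserini-*-fatjar.jar")):
        version = parse_anserini_version(path.name)
        if version is not None:
            found.append((path.name, version))
    return found


if __name__ == "__main__":
    print("pyserini:", installed_pyserini_version())
    # Pass the directory that holds the jar as the first argument.
    if len(sys.argv) > 1:
        for name, version in find_anserini_jars(sys.argv[1]):
            print("jar:", name, version)
```

If more than one fatjar shows up, or the jar looks older than the installed package expects, removing the manually copied jar and upgrading to the new release mentioned above is probably the simplest thing to try.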