shashankg7 closed this 3 years ago
Ah, this is a newly introduced feature that hasn't been published in the PyPI package yet... so you'll need to get a dev installation: https://github.com/castorini/pyserini/#development-installation
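If you want to confirm whether your installed copy actually ships the module before running the command, here is a minimal sketch using only the standard library. The helper name `has_submodule` is made up for illustration and is not part of Pyserini; the only assumption about Pyserini is the module name `pyserini.dindex` taken from the command in the question.

```python
import importlib.util


def has_submodule(qualified_name: str) -> bool:
    """Return True if the dotted module name can be located in this environment.

    Hypothetical helper for illustration; it is not part of Pyserini.
    Note that find_spec imports the parent package as a side effect.
    """
    try:
        return importlib.util.find_spec(qualified_name) is not None
    except (ModuleNotFoundError, ValueError):
        # The parent package is missing, is not a package, or has no usable spec.
        return False


if __name__ == "__main__":
    # False here suggests the installed release predates the feature,
    # in which case the development installation is the way to go.
    print(has_submodule("pyserini.dindex"))
```

If this prints `False` with the PyPI release installed, that would be consistent with the feature only being available from a development installation.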
Thank you @lintool for your response.
I was able to get it running with the development version.
I have a follow-up question. I am trying to index a Wikipedia corpus and search on it using dense retrieval.
I tried the DPR encoder, but it's not giving good results. I think this is because DPR was trained on the QA domain.
Any suggestions on which encoder I can use for Wikipedia?
The short answer is... we don't know. You're basically talking about an open research question...
Dense retrieval is known to be very corpus/query specific, often with poor zero-shot effectiveness when transferred to another collection.
That's why we have benchmarks like https://github.com/UKPLab/beir to further explore...
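To get a rough sense of how well a given encoder transfers to your own corpus, one option is to hand-label a few queries and compute recall@k over the retrieved document IDs. The sketch below is only that: the function names and the `runs`/`qrels` dictionaries are hypothetical, and it is independent of both Pyserini and BEIR.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(runs: Dict[str, List[str]],
                     qrels: Dict[str, Set[str]],
                     k: int) -> float:
    """Average recall@k over all judged queries; unanswered queries score 0."""
    scores = [recall_at_k(runs.get(qid, []), rel, k) for qid, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: ranked doc IDs per query and hand-labeled relevant IDs.
    runs = {"q1": ["d1", "d2"], "q2": ["d7", "d5"]}
    qrels = {"q1": {"d1"}, "q2": {"d5", "d6"}}
    print(mean_recall_at_k(runs, qrels, k=2))
```

Running the same labeled queries against two encoders gives a like-for-like comparison, though a handful of queries will only give a coarse signal.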
Thanks, @lintool, for your response and the pointer to the IR benchmark. I'll look into it more.
Hi,
I have a Wikipedia-related custom corpus, which I am trying to index using dense vectors.
But the command "python -m pyserini.dindex" is not working.
I am getting an error:
AttributeError: module 'pyserini' has no attribute 'dindex'
I have installed all of the dependencies, so I'm not sure what is wrong.
Please let me know.