Closed PreetJhanglani closed 9 months ago
Hi @PreetJhanglani
There was a related report here: https://github.com/terrier-org/pyterrier/issues/396
In that report, I identified a need to de-conflict the Snowball stemmers between Lucene and Terrier. As of today, that's a work in progress.
There might be some hints in #396 on how to work around your problem.
OK, I have this working in a branch of PyTerrier. The salient details are:
%pip install git+https://github.com/terrier-org/pyterrier.git@anserini22
%pip install pyserini==0.22.0 faiss-cpu
import pyterrier as pt
# use Anserini jar file that matches pyserini install
# version='snapshot' uses a JitPack build of the current Terrier GitHub code, where Snowball has been deconflicted from Lucene
pt.init(boot_packages=["io.anserini:anserini:0.22.0:fatjar"], version='snapshot')
Example notebook at: https://colab.research.google.com/drive/1qzcO8O-cIh8aNtVmJ2Izgzni8h4UO4xV?usp=sharing
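For anyone adapting the snippet above, here is a minimal sketch of one way to keep the Anserini jar in step with the installed pyserini. The helper name `anserini_boot_package` is hypothetical (it is not part of PyTerrier or pyserini), and it assumes the Anserini version number is the same as the pyserini version number, as it is for 0.22.0 in this thread; please check that for other releases.

```python
from importlib import metadata
from typing import Optional


def anserini_boot_package(pyserini_version: Optional[str] = None) -> str:
    """Build the Maven coordinate of the Anserini fatjar (hypothetical helper).

    Assumption: the Anserini version equals the pyserini version, which is
    the case for the 0.22.0 pairing used in this thread.
    """
    if pyserini_version is None:
        try:
            # Look up the version of the pyserini package installed via pip.
            pyserini_version = metadata.version("pyserini")
        except metadata.PackageNotFoundError as exc:
            raise RuntimeError(
                "pyserini is not installed; pass a version explicitly"
            ) from exc
    return f"io.anserini:anserini:{pyserini_version}:fatjar"


# Passing the version explicitly, so this runs without pyserini installed.
coordinate = anserini_boot_package("0.22.0")
print(coordinate)
```

The resulting string can then be passed to the same call as above, i.e. pt.init(boot_packages=[coordinate], version='snapshot').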
Hi @PreetJhanglani, I have merged the changes into the master branch. They will be included in the next PyTerrier release, so I will mark this issue as addressed.
I tried the example from the documentation (https://pyterrier.readthedocs.io/en/latest/anserini.html#examples). I built the Lucene index using pyserini, but when I run BM25_ai = pt.anserini.AnseriniBatchRetrieve(luceneIndex, wmodel="BM25"), it gives an error that ends with JavaException: JVM exception occurred: io/anserini/eval/Qrels java.lang.NoClassDefFoundError. I get the same error on Colab and on my M1 Mac.