Closed: ali-abz closed this issue 3 years ago
Yes, it does look like this is an error with Anserini. I will try to reproduce it soon. In the meantime, could you also turn on debug logging and provide the output from Anserini? (this should be above the exception)
To turn on the logging, run this before importing capreolus for the first time:
import os
os.environ["CAPREOLUS_LOGGING"] = "DEBUG"
These are the logs before the runtime error:
2021-02-18 16:59:46,145 - INFO - capreolus.index.anserini._create_index - building index /home/ali/.capreolus/cache/collection-nf/index-anserini_indexstops-False_stemmer-porter/index
2021-02-18 16:59:46,150 - DEBUG - capreolus.index.anserini._create_index - ['java', '-classpath', '/home/ali/miniconda3/lib/python3.8/site-packages/pyserini/resources/jars/anserini-0.9.3-fatjar.jar', '-Xms512M', '-Xmx31G', "-Dapp.name='IndexCollection'", 'io.anserini.index.IndexCollection', '-collection', 'TrecCollection', '-generator', 'DefaultLuceneDocumentGenerator', '-threads', '8', '-input', '/home/ali/.capreolus/cache/collection-nf/documents', '-index', PosixPath('/home/ali/.capreolus/cache/collection-nf/index-anserini_indexstops-False_stemmer-porter/index'), '-stemmer', 'porter', '-storePositions', '-storeDocvectors', '-storeContents']
Do I need to install anserini by myself? I am assuming that pyserini will take care of it, right?
That's right. You don't need to install anserini, but you do need to install java 11.
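If it helps to confirm which Java the system will pick up, here is a minimal sketch that runs `java -version` and extracts the major version. The helper names `parse_java_major` and `installed_java_major` are made up for illustration and are not part of capreolus or pyserini; the parsing assumes the usual `version "..."` line that common JVMs print.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_281").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Return the major version of the `java` found on PATH, or None if absent."""
    java = shutil.which("java")
    if java is None:
        return None
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    # Most JVMs print the version banner to stderr, so check both streams.
    return parse_java_major(result.stderr + result.stdout)
```

Calling `installed_java_major()` in the same environment that runs capreolus should return 11 (or `None` if no `java` is on the PATH).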
That could explain it, since I was using java 8 (I don't know why it happened on colab, since that was using java 11). I cannot reproduce the error on my systems with java 11 because capreolus cannot be installed due to dependency conflicts. One of them (pytest) can be fixed by installing the correct version, but I cannot resolve the other two.
$ pip install capreolus
...
ERROR: pytest-mock 3.5.1 has requirement pytest>=5.0, but you'll have pytest 3.6.4 which is incompatible.
ERROR: scispacy 0.4.0 has requirement spacy<3.1.0,>=3.0.0, but you'll have spacy 2.2.4 which is incompatible.
ERROR: pymagnitude is in an unsupported or invalid wheel
I read something like "as of 2021, pip resolves dependencies differently". Could this be it? And I think I should open another issue for this new error, right?
There is something off with my system and I still can't use capreolus on it. However, I cannot reproduce the error above, so I think this issue should be closed. My other systems are fine and run capreolus with no problem.
Thanks for the update! If you later want to debug the system where it isn't working, my guess is that you have conflicts due to other Python packages installed. You could get around that with pipenv (or miniconda, or a regular virtualenv). Anserini failures in general are likely to be related to a missing Java 11.
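As a rough way to see which of the packages named in the pip errors are already present in the affected environment, something like the following sketch could be run there. It only uses the standard library `importlib.metadata` (Python 3.8+); the helper names and the package list are illustrative and taken from the error messages above, not from capreolus itself.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string for `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages):
    """Map each package name to its installed version (None when not installed)."""
    return {name: installed_version(name) for name in packages}


# Packages mentioned in the pip conflict messages in this thread.
CONFLICT_CANDIDATES = ["pytest", "pytest-mock", "spacy", "scispacy", "pymagnitude"]
```

Printing `report(CONFLICT_CANDIDATES)` in the broken environment and in a fresh virtualenv should show whether pre-installed versions differ between the two.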
I'm pretty new to neural IR so I might be doing it wrong, but when I try to follow the code in https://capreolus.ai/en/latest/quick.html#command-line-interface and https://capreolus.ai/en/latest/quick.html#command-line-interface, I get an ambiguous error. I believe there is something off with Anserini.
I tried the code on 3 different systems (Google Colab, Ubuntu's Python, and Anaconda) and installed capreolus with pip.