Closed kubami closed 2 years ago
A different problem happens when installing with Pipenv. It seems to go into a loop while trying to resolve dependencies, and as a by-product it uses a seemingly unlimited amount of disk space :). This is probably a bug in Pipenv.
Hi @kubami I'm trying to reproduce the issue but I get a slightly different error.
Can you tell me your:
Poetry: 1.1.13
Python: 3.8.12 (I am using pyenv in conjunction with poetry)
Haystack checkout: 0395533a786cc63bf2f5180ee7d3dc3eefebdd59
Pipenv: 2022.5.2
But this is not constrained to those versions.
I have been trying to install haystack with poetry for a couple of months... I always gave up and reverted to using pip.
Which is a pain, because all our tooling/workflows are built around poetry.
Thank you for taking a look at this. What error are you getting? Did you specify the extras with poetry? The errors change with different extras.
Hi @kubami, I was able to install Haystack successfully with Poetry using
poetry add git+https://github.com/deepset-ai/haystack.git#master
This seems to have installed the full feature set that Haystack offers. My environment config is as follows:
Poetry: 1.1.13
Pyenv: 2.2.5
Python: 3.9.1
Haystack: 1.5.1rc0
Poetry config:
virtualenvs.create = true
virtualenvs.in-project = true
virtualenvs.path = ".venv"
To check that it was working as intended, I was able to launch each variation of the document stores provided.
@kmcleste thanks for checking this out.
I can confirm I was able to install haystack with poetry without specifying any extras.
poetry add ./vendors/haystack
(where the latest haystack is checked out). This worked for both haystack 1.4.0 and 1.5.0rc0.
It seems the problem is with specifying extras.
Thanks all for contributing to the issue. I'll keep it open as I want it to work with extras too, but I'm glad you have a workaround for now.
I've just tried to install the latest haystack version through poetry. The installation went through without any issue, but when importing anything from haystack I get a huggingface-hub error. My guess is that the latest huggingface-hub release is broken, and it gets pulled in simply because it is the latest version.
$ python test.py
INFO - haystack.document_stores.base - Numba not found, replacing njit() with no-op implementation. Enable it with 'pip install numba'.
/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/huggingface_hub/snapshot_download.py:6: FutureWarning: snapshot_download.py has been made private and will no longer be available from version 0.11. Please use `from huggingface_hub import snapshot_download` to import the only public function in this module. Other members of the file may be changed without a deprecation notice.
warnings.warn(
Traceback (most recent call last):
File "/home/florian/test/test-haystack-and-huggingface/test.py", line 1, in <module>
from haystack import Pipeline
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/haystack/__init__.py", line 26, in <module>
from haystack.nodes.base import BaseComponent
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/haystack/nodes/__init__.py", line 5, in <module>
from haystack.nodes.answer_generator import BaseGenerator, RAGenerator, Seq2SeqGenerator
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/haystack/nodes/answer_generator/__init__.py", line 2, in <module>
from haystack.nodes.answer_generator.transformers import RAGenerator, Seq2SeqGenerator
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/haystack/nodes/answer_generator/transformers.py", line 18, in <module>
from haystack.nodes.retriever.dense import DensePassageRetriever
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/haystack/nodes/retriever/__init__.py", line 2, in <module>
from haystack.nodes.retriever.dense import DensePassageRetriever, EmbeddingRetriever, TableTextRetriever
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/haystack/nodes/retriever/dense.py", line 22, in <module>
from haystack.nodes.retriever._embedding_encoder import _EMBEDDING_ENCODERS
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/haystack/nodes/retriever/_embedding_encoder.py", line 8, in <module>
from sentence_transformers import InputExample, losses
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/sentence_transformers/__init__.py", line 3, in <module>
from .datasets import SentencesDataset, ParallelSentencesDataset
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/sentence_transformers/datasets/__init__.py", line 3, in <module>
from .ParallelSentencesDataset import ParallelSentencesDataset
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/sentence_transformers/datasets/ParallelSentencesDataset.py", line 4, in <module>
from .. import SentenceTransformer
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/sentence_transformers/SentenceTransformer.py", line 25, in <module>
from .evaluation import SentenceEvaluator
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/sentence_transformers/evaluation/__init__.py", line 5, in <module>
from .InformationRetrievalEvaluator import InformationRetrievalEvaluator
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/sentence_transformers/evaluation/InformationRetrievalEvaluator.py", line 6, in <module>
from ..util import cos_sim, dot_score
File "/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/sentence_transformers/util.py", line 407, in <module>
from huggingface_hub.snapshot_download import REPO_ID_SEPARATOR
ImportError: cannot import name 'REPO_ID_SEPARATOR' from 'huggingface_hub.snapshot_download' (/home/florian/.cache/pypoetry/virtualenvs/test-haystack-and-huggingface-3I8GGk3v-py3.9/lib/python3.9/site-packages/huggingface_hub/snapshot_download.py)
Poetry installation:
$ poetry add farm-haystack
Creating virtualenv test-haystack-and-huggingface-3I8GGk3v-py3.9 in /home/florian/.cache/pypoetry/virtualenvs
Using version ^1.5.0 for farm-haystack
Updating dependencies
Resolving dependencies... (21.5s)
Writing lock file
Package operations: 91 installs, 0 updates, 0 removals
• Installing certifi (2022.6.15)
• Installing charset-normalizer (2.0.12)
• Installing idna (3.3)
• Installing markupsafe (2.1.1)
• Installing pyparsing (3.0.9)
• Installing urllib3 (1.26.9)
• Installing zipp (3.8.0)
• Installing click (8.1.3)
• Installing filelock (3.7.1)
• Installing greenlet (1.1.2)
• Installing importlib-metadata (4.11.4)
• Installing itsdangerous (2.1.2)
• Installing jinja2 (3.1.2)
• Installing numpy (1.22.4)
• Installing oauthlib (3.2.0)
• Installing packaging (21.3)
• Installing pyyaml (6.0)
• Installing requests (2.28.0)
• Installing six (1.16.0)
• Installing smmap (5.0.0)
• Installing tqdm (4.64.0)
• Installing typing-extensions (4.2.0)
• Installing werkzeug (2.1.2)
• Installing docopt (0.6.2)
• Installing flask (2.1.2)
• Installing gitdb (4.0.9)
• Installing huggingface-hub (0.8.0)
• Installing isodate (0.6.1)
• Installing joblib (1.1.0)
• Installing mako (1.2.0)
• Installing pillow (9.1.1)
• Installing prometheus-client (0.14.1)
• Installing pyjwt (2.4.0)
• Installing python-dateutil (2.8.2)
• Installing pytz (2022.1)
• Installing regex (2022.6.2)
• Installing requests-oauthlib (1.3.1)
• Installing scipy (1.6.1)
• Installing sqlalchemy (1.4.37)
• Installing tabulate (0.8.9)
• Installing threadpoolctl (3.1.0)
• Installing tokenizers (0.12.1)
• Installing torch (1.11.0)
• Installing websocket-client (1.3.2)
• Installing alembic (1.8.0)
• Installing attrs (21.4.0)
• Installing azure-common (1.1.28)
• Installing azure-core (1.22.1)
• Installing backoff (1.11.1)
• Installing cloudpickle (2.1.0)
• Installing databricks-cli (0.16.8)
• Installing docker (5.0.3)
• Installing entrypoints (0.4)
• Installing gitpython (3.1.27)
• Installing gunicorn (20.1.0)
• Installing inflect (5.6.0)
• Installing jarowinkler (1.0.2)
• Installing lxml (4.9.0)
• Installing monotonic (1.6)
• Installing msrest (0.6.21)
• Installing nltk (3.7)
• Installing num2words (0.5.10)
• Installing pandas (1.4.2)
• Installing prometheus-flask-exporter (0.20.2)
• Installing protobuf (4.21.1)
• Installing scikit-learn (1.1.1)
• Installing querystring-parser (1.2.4)
• Installing pyrsistent (0.18.1)
• Installing sentencepiece (0.1.96)
• Installing sqlparse (0.4.2)
• Installing torchvision (0.12.0)
• Installing transformers (4.19.2)
• Installing azure-ai-formrecognizer (3.2.0b2)
• Installing dill (0.3.5.1)
• Installing elastic-apm (6.9.1)
• Installing elasticsearch (7.10.0)
• Installing jsonschema (4.6.0)
• Installing langdetect (1.0.9)
• Installing mlflow (1.26.1)
• Installing mmh3 (3.0.0)
• Installing more-itertools (8.13.0)
• Installing networkx (2.8.4)
• Installing posthog (1.4.9)
• Installing pydantic (1.9.1)
• Installing python-docx (0.8.11)
• Installing quantulum3 (0.7.10)
• Installing rapidfuzz (2.0.11)
• Installing sentence-transformers (2.2.0)
• Installing seqeval (1.2.2)
• Installing tika (1.24)
• Installing farm-haystack (1.5.0)
Downgrading huggingface-hub manually to 0.7.0 fixed that issue.
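For anyone hitting the same ImportError, here is a rough sketch of a pre-flight check that could run before importing haystack. It assumes, based only on this thread, that huggingface-hub 0.8.0 is the first version that breaks the import with sentence-transformers 2.2.0 and that 0.7.0 works; the helper names (parse_version, hub_is_compatible, check_environment) are made up for illustration and are not part of any library.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.8.0' into (0, 8, 0); suffixes such as 'rc0' are ignored."""
    parts = []
    for chunk in text.split("."):
        digits = ""
        for ch in chunk:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '0.8' compares equal to '0.8.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


# Assumption taken from this thread: 0.8.0 broke the import, 0.7.0 worked.
FIRST_BROKEN = (0, 8, 0)


def hub_is_compatible(installed):
    """Return True if the given huggingface-hub version predates the breakage."""
    return parse_version(installed) < FIRST_BROKEN


def check_environment():
    """Check the installed huggingface-hub; returns None if it is not installed."""
    try:
        installed = metadata.version("huggingface-hub")
    except metadata.PackageNotFoundError:
        return None
    return hub_is_compatible(installed)


if __name__ == "__main__":
    result = check_environment()
    if result is None:
        print("huggingface-hub is not installed")
    elif result:
        print("huggingface-hub predates 0.8.0")
    else:
        print("huggingface-hub is 0.8.0 or newer; consider pinning 0.7.0")
```

The version parsing is deliberately naive and stdlib-only; in a real project a proper version parser would be the safer choice.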
It seems to be going in a loop trying to resolve dependencies, the by product is it uses infinite amount of disk space :). This is probably a bug in Pipenv.
I like using PDM and it does this as well when installing extras ([all-gpu] in my case): it takes HOURS to resolve all of the dependencies. In fact, I'm not sure I've ever successfully completed it.
If you use verbose mode, you can see that it is seemingly looping through and checking each version of each dependency against each other. This happens, at least in part, because many of the haystack dependencies do not have any minimum version listed.
So, I removed any extraneous optional dependencies and then started modifying the dependency versions to limit them to versions released in the past 2 years. That sped up the resolving considerably, but it was a major nuisance and I never fully succeeded.
So it's hard to say whether the issue is the unbounded dependencies or something inherent to these dependency managers. They cross-check for good reason, but maybe it's not practical for certain packages?
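To make the "no minimum version" point concrete, here is a small sketch that scans a list of requirement strings and reports the ones without any lower bound, since those are the ones a resolver may have to try against every historical release. The requirement list below is purely illustrative and is not haystack's actual dependency list; the function names are hypothetical, and the parsing is a simplified regex rather than a full requirement parser.

```python
import re

# Operators that put a floor under the acceptable versions.
LOWER_BOUND_OPS = (">=", ">", "==", "~=")

# name, optional [extras], then whatever version clauses follow
REQ_PATTERN = re.compile(
    r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(\[[^\]]*\])?\s*(.*)$"
)


def has_lower_bound(spec):
    """True if any comma-separated clause sets a minimum or exact version."""
    clauses = [c.strip() for c in spec.split(",") if c.strip()]
    return any(c.startswith(LOWER_BOUND_OPS) for c in clauses)


def unbounded(requirements):
    """Return the names of requirements that have no lower version bound."""
    names = []
    for line in requirements:
        # Drop environment markers such as '; python_version < "3.8"'.
        req = line.split(";", 1)[0]
        match = REQ_PATTERN.match(req)
        if not match:
            continue
        name, _extras, spec = match.groups()
        if not has_lower_bound(spec):
            names.append(name)
    return names


# Illustrative input only, NOT the real haystack requirements.
example = [
    "requests",
    "torch>1.9,<1.12",
    "transformers==4.19.2",
    "pydantic",
    "scipy<=1.8",
    "elasticsearch>=7.7,<7.11",
]

if __name__ == "__main__":
    for name in unbounded(example):
        print("no minimum version:", name)
```

Adding a floor to the packages this reports is roughly what the manual "limit to the past 2 years" edit above was doing by hand.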
Hi, I wanted to install haystack with poetry:
however I get an error:
The latest haystack is checked out in /vendors/haystack/. When installing with pip, as stated in the docs, everything works.