Closed alecrimi closed 2 months ago
Does your application require farm-haystack as opposed to 2.0 haystack (haystack-ai)? The chromaDB documentation makes it seem like it only supports 2.0+.
Hello!
`farm-haystack` (Haystack 1.x) is in maintenance mode and will be discontinued in the future. It does not support Chroma.
`haystack-ai` (Haystack 2.x) is the library we are currently developing. It supports Chroma.
As you noticed, `farm-haystack` and `haystack-ai` are not compatible (see https://github.com/deepset-ai/haystack/discussions/6684#discussioncomment-8022284).
If you need guidance on migrating from 1.x to 2.x, please take a look at the Migration guide.
> Does your application require farm-haystack as opposed to 2.0 haystack (haystack-ai)? The chromaDB documentation makes it seem like it only supports 2.0+.
No, it doesn't. But I need more than just the integration provided by the chroma-haystack package. If I understand you correctly, instead of installing farm-haystack I should simply install chroma-haystack first and then Haystack 2.0? This has never been clarified in your Chroma-Haystack guide. Is Haystack 2.0 supported on Python 3.9? I have to double-check, but I remember pip installing Haystack 1.
`chroma-haystack` automatically installs `haystack-ai` (2.x).
See https://github.com/deepset-ai/haystack-core-integrations/blob/6b07663962967a9308516753236a1642140a59c3/integrations/chroma/pyproject.toml#L25
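The same information can be checked locally from the installed package metadata, without reading the pyproject.toml. A minimal sketch using only the standard library; `declared_requirements` is a hypothetical helper name, and it returns an empty list when the distribution is not installed in the current environment.

```python
from importlib.metadata import PackageNotFoundError, requires


def declared_requirements(dist_name: str) -> list[str]:
    """Return the requirement strings declared by an installed distribution.

    Returns an empty list if the distribution is not installed or declares
    no dependencies.
    """
    try:
        # requires() returns None when no dependencies are declared.
        return list(requires(dist_name) or [])
    except PackageNotFoundError:
        return []


# Print whatever chroma-haystack declares in this environment (nothing is
# printed if it is not installed).
for requirement in declared_requirements("chroma-haystack"):
    print(requirement)
```

If `haystack-ai` shows up in that list, pip will install it alongside `chroma-haystack` whether or not `farm-haystack` is already present.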
I am seriously confused. In a brand new environment with just chainlit and chroma-haystack, I have problems with the EmbeddingRetriever; that's why I ended up installing farm-haystack and removing haystack-ai. I assume the code is different in Haystack 2.x. Can you point out what I should use instead of the following code for the embedding model? Do you have an example of how to use BM25Retriever? I found only the description in the migration manual.
This is my current (wrong) code:
`import os
import pdfplumber
from haystack_integrations.document_stores.chroma import ChromaDocumentStore

from haystack.nodes import EmbeddingRetriever
from haystack.schema import Document
from sentence_transformers import SentenceTransformer
import chromadb

HF_TOKEN = os.getenv("HF_TOKEN")

chroma_client = chromadb.Client()  # Initialize ChromaDB client
document_store = ChromaDocumentStore(client=chroma_client, embedding_dim=384)

embedding_model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
retriever = EmbeddingRetriever(document_store=document_store, embedding_model=embedding_model)
...`
I found this, it should solve the confusion: https://docs.haystack.deepset.ai/v2.0/docs/chromaqueryretriever
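For reference, a minimal sketch of what the Haystack 2.x equivalent could look like with `ChromaQueryRetriever`. It assumes an environment with only `chroma-haystack` (and therefore `haystack-ai`) installed, and no `farm-haystack`. The helper name `retrieve_with_chroma` is hypothetical, and the store is created with default arguments, so Chroma computes the embeddings with its own default embedding function rather than an explicitly loaded SentenceTransformer.

```python
def retrieve_with_chroma(texts, query, top_k=3):
    """Index `texts` in a default Chroma store and return the top matches."""
    # Imports are deferred so that loading this file does not fail when the
    # Haystack 2.x packages (haystack-ai, chroma-haystack) are not installed.
    from haystack import Document
    from haystack_integrations.components.retrievers.chroma import ChromaQueryRetriever
    from haystack_integrations.document_stores.chroma import ChromaDocumentStore

    # Assumption: default arguments (no persist path, default collection and
    # default embedding function) are good enough for a first test.
    document_store = ChromaDocumentStore()
    document_store.write_documents([Document(content=text) for text in texts])

    # The retriever takes the raw query text; Chroma embeds it internally.
    retriever = ChromaQueryRetriever(document_store=document_store)
    result = retriever.run(query=query, top_k=top_k)
    return result["documents"]
```

A call would look like `retrieve_with_chroma(["Chroma is a vector database."], "What is Chroma?")`, which returns a list of Haystack `Document` objects. Note that there is no `haystack.nodes` or `haystack.schema` import anywhere: those modules belong to Haystack 1.x.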
@alecrimi in addition to the other links in this issue, I can recommend having a look at the following python notebook: https://colab.research.google.com/github/deepset-ai/haystack-cookbook/blob/main/notebooks/chroma-indexing-and-rag-examples.ipynb It explains how to install the required dependency, how to index data, and how to query the data. An overview of the integration can be found here: https://haystack.deepset.ai/integrations/chroma-documentstore Please don't hesitate to reach out again if you have more questions!
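On the earlier BM25Retriever question: in Haystack 2.x the keyword retrievers are tied to a specific document store. A minimal sketch with the in-memory store follows; the helper name `retrieve_with_bm25` is hypothetical, and this example does not use Chroma at all, it only illustrates the 2.x counterpart of the 1.x BM25Retriever.

```python
def retrieve_with_bm25(texts, query, top_k=3):
    """Index `texts` in an in-memory store and rank them with BM25."""
    # Imports are deferred so that loading this file does not fail when
    # haystack-ai (Haystack 2.x) is not installed.
    from haystack import Document
    from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
    from haystack.document_stores.in_memory import InMemoryDocumentStore

    document_store = InMemoryDocumentStore()
    document_store.write_documents([Document(content=text) for text in texts])

    # BM25 is keyword based, so no embedding model is involved here.
    retriever = InMemoryBM25Retriever(document_store=document_store)
    return retriever.run(query=query, top_k=top_k)["documents"]
```

For example, `retrieve_with_bm25(["Haystack 2.x supports Chroma.", "Bananas are yellow."], "Chroma", top_k=1)` returns a list of `Document` objects ranked by BM25 score.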
Hi, I am hitting conflicting dependencies using chroma-haystack. For some libraries I cannot use anything newer than Python 3.9, and I have avoided Python 3.10 because some of the packages I use are not yet ready for it. So, the issues are on Python 3.9. I think the main problem is the need to install both farm-haystack[inference] and chroma-haystack,
as chroma-haystack also pulls in haystack-ai, which is incompatible with farm-haystack. In my Conda env I have chroma-haystack 0.21.1 and farm-haystack 1.26.2, and I use ChromaDocumentStore with Haystack's EmbeddingRetriever.
However, it already fails at the import of EmbeddingRetriever in the code
`import os
import pdfplumber
import chainlit as cl
import requests
from haystack_integrations.document_stores.chroma import ChromaDocumentStore

from haystack.nodes import EmbeddingRetriever
from haystack.schema import Document
from chromadb import Client as ChromaClient  # Import the ChromaDB client

HF_TOKEN = os.getenv("HF_TOKEN")

API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.1"

chroma_client = ChromaClient()  # Initialize ChromaDB client
document_store = ChromaDocumentStore(client=chroma_client, embedding_dim=384)
retriever = EmbeddingRetriever(document_store=document_store, embedding_model="sentence-transformers/all-MiniLM-L6-v2")
....`
as I get the following error:
2024-08-23 20:30:13 - Auto-enabled tracing for 'OpenTelemetryTracer'
Traceback (most recent call last):
  File "/home/bam/anaconda3/envs/haystack_chroma/bin/chainlit", line 8, in <module>
    sys.exit(cli())
  File "/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/chainlit/cli/__init__.py", line 201, in chainlit_run
    run_chainlit(target)
  File "/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/chainlit/cli/__init__.py", line 66, in run_chainlit
    load_module(config.run.module_name)
  File "/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/chainlit/config.py", line 419, in load_module
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "chromatest.py", line 7, in <module>
    from haystack.nodes import EmbeddingRetriever
  File "/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/haystack/nodes/__init__.py", line 1, in <module>
    from haystack.nodes.base import BaseComponent
  File "/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/haystack/nodes/base.py", line 11, in <module>
    from haystack.errors import PipelineSchemaError
ImportError: cannot import name 'PipelineSchemaError' from 'haystack.errors' (/home/bam/anaconda3/envs/haystack_chroma/lib/python3.9/site-packages/haystack/errors.py)
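This error is consistent with two distributions installing into the same top-level `haystack` package, so that `haystack/nodes/base.py` from 1.x ends up importing a `haystack/errors.py` that was written by 2.x. A minimal diagnostic sketch using only the standard library (it works on Python 3.9); `distributions_providing` is a hypothetical helper name, and it only detects packages installed as directories whose installer recorded a file list.

```python
from importlib.metadata import distributions


def distributions_providing(top_level: str) -> list[str]:
    """List installed distributions that ship files under `top_level/`."""
    owners = set()
    for dist in distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if not name:
            continue
        # dist.files is None when the installer recorded no file list.
        for path in dist.files or []:
            if path.parts and path.parts[0] == top_level:
                owners.add(name)
                break
    return sorted(owners)


# More than one name in this list means several distributions write into
# the same import package and can overwrite each other's modules.
print(distributions_providing("haystack"))
```

If both `farm-haystack` and `haystack-ai` appear in the output, the environment is in the mixed state described above, and recreating it with only one of the two is likely the cleanest fix.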
Removing haystack-ai improves things a bit, but then I hit other errors related to the telemetry functions and send_message().