deepset-ai / haystack-core-integrations

Additional packages (components, document stores and the likes) to extend the capabilities of Haystack version 2.0 and onwards
https://haystack.deepset.ai
Apache License 2.0

Chroma document store cannot connect to remote instance #1197

Open dheerapat opened 6 days ago

dheerapat commented 6 days ago

Describe the bug

document_store = ChromaDocumentStore(host="http://localhost", port=8000)

Using these options to connect to a remote instance results in an error raised from the chromadb package:

Traceback (most recent call last):
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
    yield
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 236, in handle_request
    resp = self._pool.handle_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
    raise exc from None
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
    response = connection.handle_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 101, in handle_request
    raise exc
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 78, in handle_request
    stream = self._connect(request)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 124, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpcore/_backends/sync.py", line 207, in connect_tcp
    with map_exceptions(exc_map):
  File "/home/dheeto/.pyenv/versions/3.11.9/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/chromadb/api/client.py", line 101, in get_user_identity
    return self._server.get_user_identity()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 144, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/chromadb/api/fastapi.py", line 144, in get_user_identity
    return UserIdentity(**self._make_request("get", "/auth/identity"))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/chromadb/api/fastapi.py", line 89, in _make_request
    response = self._session.request(method, url, **cast(Any, kwargs))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpx/_client.py", line 837, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpx/_client.py", line 926, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpx/_client.py", line 954, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpx/_client.py", line 991, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpx/_client.py", line 1027, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 235, in handle_request
    with map_httpcore_exceptions():
  File "/home/dheeto/.pyenv/versions/3.11.9/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dheeto/Desktop/haystack-local-demo/doc_store.py", line 5, in <module>
    document_store.write_documents(
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/haystack_integrations/document_stores/chroma/document_store.py", line 241, in write_documents
    self._ensure_initialized()
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/haystack_integrations/document_stores/chroma/document_store.py", line 104, in _ensure_initialized
    client = chromadb.HttpClient(
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/chromadb/__init__.py", line 204, in HttpClient
    return ClientCreator(tenant=tenant, database=database, settings=settings)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/chromadb/api/client.py", line 65, in __init__
    user_identity = self.get_user_identity()
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dheeto/Desktop/haystack-local-demo/venv/lib/python3.11/site-packages/chromadb/api/client.py", line 103, in get_user_identity
    raise ValueError(
ValueError: Could not connect to a Chroma server. Are you sure it is running?
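The root exception in the chain is `[Errno 111] Connection refused`, which means nothing accepted the TCP connection at the address the client tried. A minimal sketch for checking that independently of Haystack and chromadb, using only the standard library (`port_is_open` is a hypothetical helper name, not part of either package):

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and attempts a TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures
        return False


if __name__ == "__main__":
    # Host and port taken from the snippet in this issue
    print(port_is_open("localhost", 8000))
```

If this prints `False`, the server is not reachable at that address and the problem lies outside the document store (for example, a Docker container whose port is not published).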

To Reproduce

Run this snippet to reproduce the error. Make sure a Chroma instance is running in Docker.

from haystack_integrations.document_stores.chroma import ChromaDocumentStore
from haystack import Document

document_store = ChromaDocumentStore(host="http://localhost", port=8000)
document_store.write_documents(
    [
        Document(content="This is the first document."),
        Document(content="This is the second document."),
    ]
)
print(document_store.count_documents())

anakin87 commented 1 day ago

It works for me.

I first started the Chroma server (see docs):

chroma run

Then I executed this code:

from haystack_integrations.document_stores.chroma import ChromaDocumentStore
from haystack import Document

document_store = ChromaDocumentStore(host="localhost", port=8000)  # note the host value
document_store.write_documents(
    [
        Document(content="This is the first document."),
        Document(content="This is the second document."),
    ]
)
print(document_store.count_documents())
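The only difference from the reporter's snippet is the host value, which is passed here without the `http://` scheme. Assuming the scheme prefix is what prevents the connection, a small sketch of a helper that reduces a URL-like string to a bare hostname could look like the following. `normalize_host` is a hypothetical name used for illustration; it is not part of Haystack or chromadb.

```python
from urllib.parse import urlsplit


def normalize_host(host: str) -> str:
    """Return the bare hostname, dropping any scheme, port, or path."""
    # urlsplit only populates .hostname when the string contains "//",
    # so prepend it for plain values such as "localhost"
    candidate = host if "//" in host else f"//{host}"
    hostname = urlsplit(candidate).hostname
    if not hostname:
        raise ValueError(f"cannot extract a hostname from {host!r}")
    return hostname


if __name__ == "__main__":
    print(normalize_host("http://localhost"))
    print(normalize_host("localhost"))
```

The result can then be passed as the `host` argument, for example `ChromaDocumentStore(host=normalize_host("http://localhost"), port=8000)`.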