langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Protobuf errors when using langchain-chroma with protobuf >= 4 #26745

Open paolomainardi opened 1 month ago

paolomainardi commented 1 month ago

Example Code


import chromadb
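
A slightly fuller repro sketch: langchain_chroma imports chromadb at module import time, so the same error surfaces before any vector store is built. The embedding class, collection name, and paths below are illustrative assumptions, not part of the original report.

# Sketch only: importing langchain_chroma already pulls in chromadb,
# which is where the protobuf descriptor error is raised.
from langchain_openai import OpenAIEmbeddings  # assumed embedding backend
from langchain_chroma import Chroma

vector_store = Chroma(
    collection_name="example_collection",  # illustrative name
    embedding_function=OpenAIEmbeddings(),
    persist_directory="./chroma_db",       # illustrative path
)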

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/src/ai/langchain/cli.py", line 3, in <module>
    from ai.langchain.chat import chat
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/src/ai/langchain/chat.py", line 14, in <module>
    from ai.langchain.loaders import create_store
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/src/ai/langchain/loaders.py", line 11, in <module>
    from ai.chroma import create_store_or_retrieve, delete_collections
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/src/ai/chroma/__init__.py", line 5, in <module>
    import chromadb
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/.venv/lib/python3.12/site-packages/chromadb/__init__.py", line 6, in <module>
    from chromadb.auth.token_authn import TokenTransportHeader
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/.venv/lib/python3.12/site-packages/chromadb/auth/token_authn/__init__.py", line 24, in <module>
    from chromadb.telemetry.opentelemetry import (
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/.venv/lib/python3.12/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 12, in <module>
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/.venv/lib/python3.12/site-packages/opentelemetry/exporter/otlp/proto/grpc/trace_exporter/__init__.py", line 22, in <module>
    from opentelemetry.exporter.otlp.proto.grpc.exporter import (
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/.venv/lib/python3.12/site-packages/opentelemetry/exporter/otlp/proto/grpc/exporter.py", line 39, in <module>
    from opentelemetry.proto.common.v1.common_pb2 import (
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/.venv/lib/python3.12/site-packages/opentelemetry/proto/common/v1/common_pb2.py", line 36, in <module>
    _descriptor.FieldDescriptor(
  File "/home/paolo/webapps/sparkfabrik/int/labs/ai/rag-platform/.venv/lib/python3.12/site-packages/google/protobuf/descriptor.py", line 553, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
make: *** [Makefile:29: data-reindex] Error 1
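
For reference, workaround 2 from the error text can be applied in-process by setting the environment variable before anything imports the generated protobuf modules. A minimal sketch, with the caveat stated above that pure-Python parsing is much slower:

import os

# Force the pure-Python protobuf implementation (workaround 2 above).
# This must run before chromadb / opentelemetry import any *_pb2 modules.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

import chromadb  # now imports without the Descriptor TypeError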

Description

Installed dependencies:

[tool.poetry.dependencies]
python = ">=3.10,<3.13"
slack-bolt = "^1.20.1"
python-dotenv = "^1.0.1"
unstructured = {extras = ["epub", "md", "pdf", "xslx"], version = "^0.15.10"}
nltk = "^3.9.1"
langchain = "^0.3"
langchain-community = "^0.3"
langchain-core = "^0.3"
langchain-text-splitters = "^0.3"
langchain-anthropic = "^0.2"
langchain-chroma = "^0.1"
langchain-openai = "^0.2"
openpyxl = "^3.1.5"
beautifulsoup4 = "^4.12.3"
markdownify = "^0.13.1"
dynaconf = "^3.2.6"
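
A quick diagnostic to confirm which protobuf runtime the resolver picked and what the package shipping the failing module declares. The distribution name opentelemetry-proto is inferred from the traceback path, so treat this as a sketch rather than a definitive check:

from importlib.metadata import requires, version

# Installed protobuf runtime vs. the constraint declared by the package
# that owns opentelemetry/proto/common/v1/common_pb2.py in the traceback.
print("protobuf runtime:", version("protobuf"))
deps = requires("opentelemetry-proto") or []
print("opentelemetry-proto constraints:", [d for d in deps if "protobuf" in d])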

System Info


System Information
------------------
> OS:  Linux
> OS Version:  #1 SMP PREEMPT_DYNAMIC Tue, 10 Sep 2024 14:37:32 +0000
> Python Version:  3.12.6 (main, Sep 12 2024, 21:12:04) [GCC 12.2.0]

Package Information
-------------------
> langchain_core: 0.3.5
> langchain: 0.3.0
> langchain_community: 0.3.0
> langsmith: 0.1.125
> langchain_anthropic: 0.2.1
> langchain_chroma: 0.1.4
> langchain_cli: 0.0.31
> langchain_openai: 0.2.0
> langchain_text_splitters: 0.3.0
> langserve: 0.3.0

Optional packages not installed
-------------------------------
> langgraph

Other Dependencies
------------------
> aiohttp: 3.10.5
> anthropic: 0.34.2
> async-timeout: Installed. No version info available.
> chromadb: 0.5.7
> dataclasses-json: 0.6.7
> defusedxml: 0.7.1
> fastapi: 0.115.0
> gitpython: 3.1.43
> gritql: 0.1.5
> httpx: 0.27.2
> jsonpatch: 1.33
> langserve[all]: Installed. No version info available.
> numpy: 1.26.4
> openai: 1.47.0
> orjson: 3.10.7
> packaging: 24.1
> pydantic: 2.9.2
> pydantic-settings: 2.5.2
> PyYAML: 6.0.2
> requests: 2.32.3
> SQLAlchemy: 2.0.35
> sse-starlette: 1.8.2
> tenacity: 8.5.0
> tiktoken: 0.7.0
> tomlkit: 0.12.5
> typer[all]: Installed. No version info available.
> typing-extensions: 4.12.2
> uvicorn: 0.23.2
paolomainardi commented 1 month ago

It can be fixed by manually downgrading protobuf to 3.20.2, but I'm not sure this is the best way to go.
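
If you do go the downgrade route, a small check (sketch only) to confirm which protobuf runtime is actually in use after re-locking:

# Anything in the 3.20.x range should stop the descriptor error;
# a 4.x or 5.x version here means the pin did not take effect.
import google.protobuf
print(google.protobuf.__version__)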

SannanOfficial commented 2 weeks ago

It is not the best way to go at all. Is there any work being done as of now to support protobuf 5 and higher? Is it a big issue to resolve?

To give some detail, I have a project with other, equally important dependencies that require protobuf v5 or higher, and I am sure many other people do as well. I can't remember exactly since I am not at my office right now, but these dependencies relate to either the unstructured Excel or PDF loader packages and/or the Flask web framework.

IMHO, fixing this, ideally by upgrading the supported protobuf version, should be treated as urgent.