chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0

[Bug]: Chroma cross-version incompatibility between 0.5.0 or lower and 0.5.1 or later #2377

Closed tazarov closed 1 week ago

tazarov commented 1 week ago

What happened?

A breaking change introduced in 0.5.1 prevents newer clients from communicating with older Chroma servers.

Source ref: https://discord.com/channels/1073293645303795742/1252910305038565527

Reproduction

Run an older server version:

docker run --rm -it -p 8000:8000 chromadb/chroma:0.5.0
git clone git@github.com:chroma-core/chroma.git && cd chroma

python
>>> import chromadb
>>> client = chromadb.HttpClient()
>>> client.get_or_create_collection("test_collection")

Versions

Chroma 0.5.1+, OS: Any, Python: Any

Relevant log output

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[1], line 5
      1 import chromadb
      3 client  = chromadb.HttpClient()
----> 5 client.get_or_create_collection("test_collection")

File ~/experiments/chroma/chroma-taz-17-docs/chromadb/api/client.py:151, in Client.get_or_create_collection(self, name, metadata, embedding_function, data_loader)
    141 @override
    142 def get_or_create_collection(
    143     self,
   (...)
    149     data_loader: Optional[DataLoader[Loadable]] = None,
    150 ) -> Collection:
--> 151     return self._server.get_or_create_collection(
    152         name=name,
    153         metadata=metadata,
    154         embedding_function=embedding_function,
    155         data_loader=data_loader,
    156         tenant=self.tenant,
    157         database=self.database,
    158     )

File ~/experiments/chroma/chroma-taz-17-docs/chromadb/telemetry/opentelemetry/__init__.py:146, in trace_method.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    144 global tracer, granularity
    145 if trace_granularity < granularity:
--> 146     return f(*args, **kwargs)
    147 if not tracer:
    148     return f(*args, **kwargs)

File ~/experiments/chroma/chroma-taz-17-docs/chromadb/api/fastapi.py:288, in FastAPI.get_or_create_collection(self, name, metadata, embedding_function, data_loader, tenant, database)
    273 @trace_method(
    274     "FastAPI.get_or_create_collection", OpenTelemetryGranularity.OPERATION
    275 )
   (...)
    286     database: str = DEFAULT_DATABASE,
    287 ) -> Collection:
--> 288     return self.create_collection(
    289         name=name,
    290         metadata=metadata,
    291         embedding_function=embedding_function,
    292         data_loader=data_loader,
    293         get_or_create=True,
    294         tenant=tenant,
    295         database=database,
    296     )

File ~/experiments/chroma/chroma-taz-17-docs/chromadb/telemetry/opentelemetry/__init__.py:146, in trace_method.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    144 global tracer, granularity
    145 if trace_granularity < granularity:
--> 146     return f(*args, **kwargs)
    147 if not tracer:
    148     return f(*args, **kwargs)

File ~/experiments/chroma/chroma-taz-17-docs/chromadb/api/fastapi.py:218, in FastAPI.create_collection(self, name, metadata, embedding_function, data_loader, get_or_create, tenant, database)
    202 """Creates a collection"""
    203 resp_json = self._make_request(
    204     "post",
    205     "/collections",
   (...)
    211     params={"tenant": tenant, "database": database},
    212 )
    214 model = CollectionModel(
    215     id=resp_json["id"],
    216     name=resp_json["name"],
    217     metadata=resp_json["metadata"],
--> 218     dimension=resp_json["dimension"],
    219     tenant=resp_json["tenant"],
    220     database=resp_json["database"],
    221     version=resp_json["version"],
    222 )
    223 return Collection(
    224     client=self,
    225     model=model,
    226     embedding_function=embedding_function,
    227     data_loader=data_loader,
    228 )

KeyError: 'dimension'

tazarov commented 1 week ago

Interestingly, the cross-version tests did not catch this (to be investigated separately).
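
For illustration, the `KeyError` in the log above comes from the client indexing `resp_json["dimension"]` on a response that does not contain that key. Below is a minimal sketch of a more tolerant parse. The helper `parse_collection_fields` and the sample payload are hypothetical and not part of chromadb, and this is not necessarily how the actual patch works.

```python
from typing import Any, Dict, Optional


def parse_collection_fields(resp_json: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Extract collection fields, tolerating keys that an older server may omit.

    Hypothetical helper: it only mirrors the field names seen in the traceback.
    """
    # Treat id and name as mandatory; fail with a clear message if absent.
    missing = [key for key in ("id", "name") if key not in resp_json]
    if missing:
        raise ValueError(f"server response is missing required keys: {missing}")

    return {
        "id": resp_json["id"],
        "name": resp_json["name"],
        # dict.get returns None instead of raising KeyError for absent keys.
        "metadata": resp_json.get("metadata"),
        "dimension": resp_json.get("dimension"),
        "tenant": resp_json.get("tenant"),
        "database": resp_json.get("database"),
        "version": resp_json.get("version"),
    }


# Hypothetical payload shaped like a response that lacks "dimension".
sample_response = {"id": "123", "name": "test_collection", "metadata": None}

fields = parse_collection_fields(sample_response)
print(fields["dimension"])
```

Direct indexing (`sample_response["dimension"]`) would raise the same `KeyError` as in the report, while `.get()` yields `None` and lets the caller decide how to handle the missing value.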

terilias commented 1 week ago

Hi @tazarov,

I just realized I'm experiencing the same issue when calling list_collections(), and I was wondering what might be causing this error. What do you suggest? Should we wait for the next release, which would fix this bug via the pull request you mentioned? When can we expect it to be released?

collections = self.http_chroma_client.list_collections()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/usr/local/lib/python3.11/site-packages/chromadb/api/client.py", line 177, in list_collections
    return self._server.list_collections(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/usr/local/lib/python3.11/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 143, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^

  File "/usr/local/lib/python3.11/site-packages/chromadb/api/fastapi.py", line 212, in list_collections
    collections.append(Collection(self, **json_collection))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TypeError: Collection.__init__() got an unexpected keyword argument 'dimension'

tazarov commented 1 week ago

@terilias, I think this is an impactful enough issue that we'll release a new version shortly.

terilias commented 1 week ago

ok, thank you!

HammadB commented 1 week ago

Chroma does not try to maintain compatibility across mismatched client/server versions while it is still pre-1.0. Please ensure your client and server are the same version. We will soon patch this specific instance, however.
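
As a rough sketch of the advice above, a client could compare its own version string with the server's before doing any work. The function `ensure_same_version` is a hypothetical helper, not part of chromadb, and the commented usage assumes that `chromadb.__version__` and `client.get_version()` return comparable version strings.

```python
def ensure_same_version(client_version: str, server_version: str) -> None:
    """Raise if the client and server version strings differ.

    Hypothetical guard: exact string equality after trimming whitespace,
    matching the "same version" recommendation for pre-1.0 releases.
    """
    if client_version.strip() != server_version.strip():
        raise RuntimeError(
            f"Chroma client {client_version.strip()} does not match "
            f"server {server_version.strip()}; install matching versions."
        )


# Matching versions pass silently.
ensure_same_version("0.5.0", "0.5.0")

# Assumed usage against a running server (not executed here):
# import chromadb
# client = chromadb.HttpClient()
# ensure_same_version(chromadb.__version__, client.get_version())
```

With the versions from this report (a 0.5.1 client against a 0.5.0 server), the guard would raise `RuntimeError` up front instead of failing later with a `KeyError`.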