Creating a collection in Chroma involves several steps:
1. Create the collection in sysdb
2. Create segments (metadata + vector)
3. Create a vector segment in sysdb
4. Create a metadata segment in sysdb
If any of steps 2-3 fails, Chroma is left in an inconsistent state, with the collection already recorded in sysdb. A subsequent delete_collection or get_or_create_collection may fix the problem. However, a plain create_collection call raises UniqueConstraintError.
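The delete-then-create workaround mentioned above can be sketched as a small helper. This is only a sketch under assumptions, not part of Chroma: the name recover_and_create is hypothetical, it uses only the client methods delete_collection and create_collection named in this report, and it catches a broad Exception because the error raised for a missing collection may differ between versions.

```python
def recover_and_create(client, name, metadata=None):
    """Hypothetical helper: drop a half-created collection, then create it again.

    Only call this when the leftover entry is known to come from a failed
    create_collection, because it deletes whatever collection has this name.
    """
    try:
        client.delete_collection(name)
    except Exception:
        # Assumption: a failure here means there was nothing to delete.
        pass
    return client.create_collection(name, metadata=metadata)
```

With a real client this would be called as recover_and_create(chromadb.Client(), "test"). When the existing data must be kept, get_or_create_collection is the safer of the two workarounds, since it does not delete anything.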
This is not a critical issue, as there are ways to work around it. However, it highlights the need for robust error handling, including but not limited to rollback.
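One possible shape for the rollback is a compensating delete when a later step fails. The sketch below uses a toy in-memory stand-in rather than Chroma's real sysdb: ToySysDB, build_segments, ALLOWED_PARAMS and create_collection_with_rollback are all hypothetical names, and the parameter check only imitates the "Unknown HNSW parameter" failure from the log.

```python
ALLOWED_PARAMS = {"hnsw:space"}  # illustrative only, not Chroma's real list


class ToySysDB:
    """In-memory stand-in for sysdb (hypothetical, not Chroma's class)."""

    def __init__(self):
        self.collections = set()
        self.segments = {}  # collection name -> list of segment types

    def create_collection(self, name):
        if name in self.collections:
            raise ValueError(f"Collection {name} already exists")
        self.collections.add(name)

    def create_segment(self, name, segment_type):
        self.segments.setdefault(name, []).append(segment_type)

    def delete_collection(self, name):
        self.collections.discard(name)
        self.segments.pop(name, None)


def build_segments(metadata):
    """Toy segment construction that rejects unknown parameters."""
    for key in metadata or {}:
        if key not in ALLOWED_PARAMS:
            raise ValueError(f"Unknown HNSW parameter: {key}")
    return ["vector", "metadata"]


def create_collection_with_rollback(sysdb, name, metadata=None):
    # The collection row is written first; a failure here needs no rollback.
    sysdb.create_collection(name)
    try:
        for segment_type in build_segments(metadata):
            sysdb.create_segment(name, segment_type)
    except Exception:
        # Compensating action: remove the row written above, then re-raise.
        sysdb.delete_collection(name)
        raise
```

Because the initial create_collection call sits outside the try block, an "already exists" error never triggers the delete, so a collection that existed before the call is left alone.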
Versions
Chroma 0.4.x and 0.5.x (single-node), any OS or Python version
Relevant log output
Python 3.11.7 (main, Dec 30 2023, 14:03:09) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import chromadb
>>> client = chromadb.Client()
>>> try:
...     client.create_collection("test",metadata={"hnsw:batch_size":100})
... except Exception as e:
...     print(e)
...
Unknown HNSW parameter: hnsw:batch_size
>>> client.create_collection("test")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/tazarov/experiments/chroma/taz-sprint-14/chromadb/api/client.py", line 198, in create_collection
    return self._server.create_collection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tazarov/experiments/chroma/taz-sprint-14/chromadb/telemetry/opentelemetry/__init__.py", line 143, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/Users/tazarov/experiments/chroma/taz-sprint-14/chromadb/api/segment.py", line 173, in create_collection
    coll, created = self._sysdb.create_collection(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tazarov/experiments/chroma/taz-sprint-14/chromadb/telemetry/opentelemetry/__init__.py", line 143, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/Users/tazarov/experiments/chroma/taz-sprint-14/chromadb/db/mixins/sysdb.py", line 220, in create_collection
    raise UniqueConstraintError(f"Collection {name} already exists")
chromadb.db.base.UniqueConstraintError: Collection test already exists
Note: The issue above is reproducible with in-memory Chroma, single-node local or server (distributed not tested).