chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0

[Bug]: Client creation hangs indefinitely on Azure, before raising an OperationalError #1108

Open ai-st opened 11 months ago

ai-st commented 11 months ago

What happened?

When creating a persistent client on Azure, either inside a notebook or on a compute, the client hangs for about 16 minutes after creation: client = chromadb.PersistentClient()

After more than 16 minutes, I get this error message:

OperationalError: database is locked

Specifying a path doesn't help either; the result is the same.

Versions

Chroma 0.4.9, Python 3.10, Ubuntu 20.04.6 LTS

Relevant log output

---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
Cell In[13], line 1
----> 1 persist_client = chromadb.PersistentClient()

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/chromadb/__init__.py:106, in PersistentClient(path, settings)
    103 settings.persist_directory = path
    104 settings.is_persistent = True
--> 106 return Client(settings)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/chromadb/__init__.py:145, in Client(settings)
    142 telemetry_client = system.instance(Telemetry)
    143 api = system.instance(API)
--> 145 system.start()
    147 # Submit event for client start
    148 telemetry_client.capture(ClientStartEvent())

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/chromadb/config.py:268, in System.start(self)
    266 super().start()
    267 for component in self.components():
--> 268     component.start()

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/chromadb/db/impl/sqlite.py:93, in SqliteDB.start(self)
     91     cur.execute("PRAGMA foreign_keys = ON")
     92     cur.execute("PRAGMA case_sensitive_like = ON")
---> 93 self.initialize_migrations()

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/chromadb/db/migrations.py:128, in MigratableDB.initialize_migrations(self)
    125     self.validate_migrations()
    127 if migrate == "apply":
--> 128     self.apply_migrations()

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/chromadb/db/migrations.py:147, in MigratableDB.apply_migrations(self)
    145 def apply_migrations(self) -> None:
    146     """Validate existing migrations, and apply all new ones."""
--> 147     self.setup_migrations()
    148     for dir in self.migration_dirs():
    149         db_migrations = self.db_migrations(dir)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/chromadb/db/impl/sqlite.py:149, in SqliteDB.setup_migrations(self)
    147 @override
    148 def setup_migrations(self) -> None:
--> 149     with self.tx() as cur:
    150         cur.execute(
    151             """
    152              CREATE TABLE IF NOT EXISTS migrations (
   (...)
    160              """
    161         )

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/chromadb/db/impl/sqlite.py:47, in TxWrapper.__exit__(self, exc_type, exc_value, traceback)
     45 if len(self._tx_stack.stack) == 0:
     46     if exc_type is None:
---> 47         self._conn.commit()
     48     else:
     49         self._conn.rollback()

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/chromadb/db/impl/sqlite_pool.py:31, in Connection.commit(self)
     30 def commit(self) -> None:
---> 31     self._conn.commit()

HammadB commented 11 months ago

Hi, it sounds like you are using Azure Notebooks. I believe this is due to the file system that Azure Notebooks uses.

https://stackoverflow.com/questions/53226642/sqlite3-database-is-locked-in-azure
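
As a rough way to check whether a given directory is affected before pointing Chroma at it, a small SQLite write probe along the following lines might help. This is only a sketch: the function name sqlite_write_probe, the probe file name, and the 5-second timeout are made up for illustration and are not part of Chroma. It uses only Python's standard sqlite3 module, and it assumes the lock surfaces as an OperationalError; if the file system itself blocks, the probe may still take a long time to return.

```python
import sqlite3
import tempfile
from pathlib import Path


def sqlite_write_probe(directory: str, timeout_s: float = 5.0) -> bool:
    """Return True if a small SQLite write transaction commits in `directory`.

    Hypothetical helper, not part of chromadb. `timeout_s` is how long
    sqlite3 waits on a locked database before raising OperationalError.
    """
    probe_file = Path(directory) / "chroma_probe.sqlite3"
    try:
        conn = sqlite3.connect(str(probe_file), timeout=timeout_s)
        try:
            # A tiny write transaction, similar in spirit to the commit
            # that fails in the traceback above.
            conn.execute("CREATE TABLE IF NOT EXISTS probe (id INTEGER PRIMARY KEY)")
            conn.execute("INSERT INTO probe DEFAULT VALUES")
            conn.commit()
        finally:
            conn.close()
        return True
    except sqlite3.OperationalError:
        # Covers "database is locked" as well as "unable to open database file".
        return False
    finally:
        # Clean up the probe file if it was created.
        probe_file.unlink(missing_ok=True)


if __name__ == "__main__":
    # The local temp directory is usually on a local disk, so this should pass.
    print(sqlite_write_probe(tempfile.gettempdir()))
```

If the probe returns False for the intended persist directory but True for a local one, that would be consistent with the file system explanation above.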

ai-st commented 11 months ago

Thanks! What would be a solution to be able to use the PersistentClient?

ai-st commented 11 months ago

@HammadB Thanks for providing guidance. Creating a temp dir seems to work, after which the contents can be copied manually to a more permanent destination:

import tempfile
from pathlib import Path

import chromadb

# Keep the database in the local temp directory instead of the mounted share
chroma_path = Path(tempfile.gettempdir()) / 'chroma_db'

client = chromadb.PersistentClient(path=str(chroma_path))

This works. However, a clearer error message, and especially not having to wait 16 minutes to get it, would be very helpful.
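
The manual copy step mentioned above could be sketched roughly as follows. The helper name copy_chroma_dir and the destination path are hypothetical; the copy itself uses shutil.copytree from the standard library (dirs_exist_ok requires Python 3.8 or later). It assumes the client is no longer writing to the directory when the copy runs, since copying a SQLite file during a write could produce an inconsistent snapshot.

```python
import shutil
import tempfile
from pathlib import Path


def copy_chroma_dir(src: Path, dst: Path) -> Path:
    """Copy everything under `src` into `dst`, creating `dst` if needed.

    Hypothetical helper, not part of chromadb. Existing files in `dst`
    with the same names are overwritten.
    """
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Usage sketch (paths are assumptions for illustration):
#
#   chroma_path = Path(tempfile.gettempdir()) / 'chroma_db'
#   ... create the PersistentClient at chroma_path and finish all writes ...
#   copy_chroma_dir(chroma_path, Path('/path/to/permanent/chroma_db'))
```

To resume later, the same helper could copy the permanent directory back into the temp location before the client is created.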

PhilLovesToCode commented 5 months ago

I am experiencing a similar issue on my Mac with Python 3.10. My script creates a PersistentClient, does a simple list_collections(), and finishes. However, the Python process does not exit immediately; it hangs for a minute or two before releasing. This is a concern because a client that lingers like this affects scalability, and there is no explicit close() method on the object. Is this a bug or expected behavior?

smurli commented 5 months ago

+1 I am facing the same issue. Any fix?

henry8168 commented 3 weeks ago

Same problem here with chromadb==0.5.1.

mersu898 commented 3 weeks ago

Hi,

I had a similar issue as well, with an Azure App Service running ChromaDB and saving its files on a mounted Azure File Share.

I had files with data previously created, so I couldn't delete everything and start from scratch.

So far, the solution that has worked for me is (we're still monitoring to check if everything is fine):

Hope this helps.