DAGWorks-Inc / hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. It runs and scales everywhere Python does.
https://hamilton.dagworks.io/en/latest/
BSD 3-Clause Clear License

'database is locked' error when using CachingGraphAdapter on Azure Linux ML Compute #1058

Closed: giladrubin1 closed this issue 1 month ago

giladrubin1 commented 1 month ago

Current behavior

When using hamilton's disk-based caching adapter (hamilton.experimental.h_cache.CachingGraphAdapter; the stack trace below is from hamilton.plugins.h_diskcache.DiskCacheAdapter) on an Azure Linux ML Compute, a 'database is locked' error occurs after running for about a minute.

Stack Traces

---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
Cell In[9], line 4
      1 from hamilton.plugins import h_diskcache
      2 from hamilton.driver import Builder
----> 4 cache_adapter = h_diskcache.DiskCacheAdapter()
      6 builder = (Builder()
      7            .with_adapters(cache_adapter)
      8            )

File /anaconda/envs/pdf-env/lib/python3.10/site-packages/hamilton/plugins/h_diskcache.py:84, in DiskCacheAdapter.__init__(self, cache_vars, cache_path, **cache_settings)
     82 self.cache_vars = cache_vars if cache_vars else []
     83 self.cache_path = cache_path
---> 84 self.cache = diskcache.Cache(directory=cache_path, **cache_settings)
     85 self.nodes_history: Dict[str, List[str]] = self.cache.get(
     86     key=DiskCacheAdapter.nodes_history_key, default=dict()
     87 )  # type: ignore
     88 self.used_nodes_hash: Dict[str, str] = dict()

File /anaconda/envs/pdf-env/lib/python3.10/site-packages/diskcache/core.py:478, in Cache.__init__(self, directory, timeout, disk, **settings)
    476 for key, value in sorted(sets.items()):
    477     if key.startswith('sqlite_'):
--> 478         self.reset(key, value, update=False)
    480 sql(
    481     'CREATE TABLE IF NOT EXISTS Settings ('
    482     ' key TEXT NOT NULL UNIQUE,'
    483     ' value)'
    484 )
    486 # Setup Disk object (must happen after settings initialized).

File /anaconda/envs/pdf-env/lib/python3.10/site-packages/diskcache/core.py:2438, in Cache.reset(self, key, value, update)
   2436         update = True
   2437     if update:
-> 2438         sql('PRAGMA %s = %s' % (pragma, value)).fetchall()
   2439     break
   2440 except sqlite3.OperationalError as exc:

OperationalError: database is locked

Steps to replicate behavior

  1. Set up an Azure Linux ML Compute environment
  2. Open a VSCode notebook
  3. Run the following code:

    from hamilton.plugins import h_diskcache
    from hamilton.driver import Builder

    cache_adapter = h_diskcache.DiskCacheAdapter()

    builder = (
        Builder()
        .with_adapters(cache_adapter)
    )
  4. Wait for about a minute
  5. Observe the 'database is locked' error

Library & System Information

Expected behavior

The CachingGraphAdapter should initialize without a 'database is locked' error.

Additional context

After the error occurs, the cache.db file cannot be deleted until the kernel is restarted. This suggests that the process keeps an open handle on the file, so it remains locked even after the error is raised.
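For context, the underlying SQLite failure mode can be reproduced with the standard library alone: a connection that cannot obtain the write lock raises OperationalError('database is locked'). This is a minimal local sketch, not the Azure-specific case; on network-mounted filesystems (such as the Azure ML workspace mount) the lock can appear held even without a competing writer, because POSIX file locking is unreliable there.

```python
import os
import sqlite3
import tempfile

# Create a throwaway database file on local disk.
db = os.path.join(tempfile.mkdtemp(), "cache.db")

# First connection takes and holds the exclusive write lock.
# isolation_level=None puts the connection in autocommit mode so the
# explicit BEGIN EXCLUSIVE is passed straight through to SQLite.
holder = sqlite3.connect(db, isolation_level=None)
holder.execute("CREATE TABLE t (k TEXT)")
holder.execute("BEGIN EXCLUSIVE")

# Second connection fails fast (timeout=0) instead of waiting for the lock.
blocked = sqlite3.connect(db, timeout=0)
locked_msg = None
try:
    blocked.execute("INSERT INTO t VALUES ('x')")
except sqlite3.OperationalError as exc:
    locked_msg = str(exc)

print(locked_msg)  # prints: database is locked
```

diskcache wraps exactly this kind of SQLite access, which is why the adapter surfaces the same OperationalError.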

skrawcz commented 1 month ago

@giladrubin1 I think this might be an Azure issue - https://github.com/microsoft/autogen/issues/79

Can you try specifying a different location for the cache? Or try https://github.com/microsoft/autogen/issues/79#issuecomment-1744086814 if that makes sense?
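A minimal sketch of the suggested workaround: point the cache at a directory on local disk (e.g. under /tmp) rather than the network-mounted workspace filesystem. The directory name here is illustrative; cache_path is the constructor parameter visible in the stack trace above.

```python
import tempfile
from pathlib import Path

# Pick a cache directory on the local filesystem instead of the
# network-mounted Azure ML workspace directory, where SQLite file
# locking is unreliable. The subdirectory name is arbitrary.
cache_dir = Path(tempfile.gettempdir()) / "hamilton_diskcache"
cache_dir.mkdir(parents=True, exist_ok=True)

# Assuming the environment from the issue (hamilton installed):
# from hamilton.plugins import h_diskcache
# cache_adapter = h_diskcache.DiskCacheAdapter(cache_path=str(cache_dir))
print(cache_dir)
```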

giladrubin1 commented 1 month ago

Yes, it seems like it's an SQLite issue on AzureML Compute. I'll change the cache directory to a location outside of the compute's mounted filesystem; hopefully that will work. Thanks!