Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
Cache only returns the first run_id that populated the cache when calling get_run_ids() on a hamilton.caching.stores.sqlite.SQLiteMetadataStore object
Current behavior
When calling get_run_ids() on a hamilton.caching.stores.sqlite.SQLiteMetadataStore class object, only the first run_id that populated the cache is returned rather than all run_ids of the cache.
Stack Traces
I believe this is caused by a line in hamilton/caching/stores/sqlite.py where the returned result object is indexed to its first item only.
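To illustrate the suspected cause, here is a minimal sketch using only the standard library's sqlite3 module. The table name, column names, and both helper functions are hypothetical stand-ins and are not taken from hamilton/caching/stores/sqlite.py; the sketch only shows how indexing the fetched rows at 0 drops every row after the first.

```python
import sqlite3

# Hypothetical stand-in for the metadata store's run-id table; the real
# schema in hamilton/caching/stores/sqlite.py may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE run_ids (id INTEGER PRIMARY KEY AUTOINCREMENT, run_id TEXT)"
)
conn.executemany(
    "INSERT INTO run_ids (run_id) VALUES (?)", [("run-a",), ("run-b",)]
)


def get_run_ids_first_only(connection):
    """Suspected pattern: index the fetched rows at 0, keeping only the first row."""
    result = connection.execute(
        "SELECT run_id FROM run_ids ORDER BY id"
    ).fetchall()
    return result[0]


def get_run_ids_all(connection):
    """Wanted pattern: unpack the run_id column from every fetched row."""
    result = connection.execute(
        "SELECT run_id FROM run_ids ORDER BY id"
    ).fetchall()
    return [row[0] for row in result]


first_only = get_run_ids_first_only(conn)
all_ids = get_run_ids_all(conn)
print(first_only)  # ('run-a',)
print(all_ids)     # ['run-a', 'run-b']
```

Because fetchall() returns a list of row tuples, indexing that list yields a single row, while unpacking each row yields one entry per run.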
Steps to replicate behavior
Build a driver with caching enabled using with_cache()
Execute the driver: driver.execute(final_vars=[some_var], inputs={some_input})
Execute the driver again, requesting a different final_var: driver.execute(final_vars=[some_other_var], inputs={some_input})
Run hamilton.caching.stores.sqlite.SQLiteMetadataStore(".hamilton_cache").get_run_ids() to get the run_ids that populated the cache
The expected behavior is that this call returns all run_ids, not just the first.
Library & System Information
Using Hamilton version 1.80.0 and Python 3.12.7
Expected behavior
When calling get_run_ids() on a hamilton.caching.stores.sqlite.SQLiteMetadataStore class object, I'd expect a full list of run ids that have populated the cache to be returned rather than just the first run.