python-cachier / cachier

Persistent, stale-free, local and cross-machine caching for Python functions.

Cachier doesn't consider variadic arguments (*args) in the cache key #230

Open trav-c opened 3 months ago

trav-c commented 3 months ago

The following uses cachier version 3.0.1 with Python 3.12.4 (system RPM package) on Linux (Fedora 40 x86_64).

I have a function that accepts a variable number of arguments via *args. After applying the @cachier decorator, every call to the function returns whichever result was cached first, regardless of the arguments passed, e.g.:

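from datetime import timedelta
from io import StringIO

import cachier
import pandas

cachier.set_default_params(stale_after=timedelta(seconds=500))

@cachier.cachier()
def get_gam_dataframe(*args):
    # run_gam is defined elsewhere in the script; its output is parsed as CSV
    output = run_gam(*args)
    return pandas.read_csv(StringIO(output))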

domains = get_gam_dataframe('print', 'domains')
users   = get_gam_dataframe('print', 'users', 'allfields')

This returns the same result for users as for domains. Setting stale_after to 0 seconds causes the expected data to be returned (but obviously at the cost of not using the cache). Additionally, adding cachier__skip_cache=True to the calls throws a TypeError:

Traceback (most recent call last):
  File "........./script.py", line 65, in <module>
    'domains': get_gam_dataframe('print', 'domains', cachier__skip_cache=True),
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "......./python3.12/site-packages/cachier/core.py", line 251, in func_wrapper
    else func(**kwargs)
         ^^^^^^^^^^^^^^
TypeError: gam_get_dataframe() got an unexpected keyword argument 'args'
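
The `func(**kwargs)` frame in the traceback suggests that the wrapper forwards the call using keyword arguments only. As a minimal sketch in plain Python (no cachier involved, and the function body is just a placeholder), passing a keyword named `args` to a function that declares only `*args` raises the same kind of error:

```python
def get_gam_dataframe(*args):
    # Placeholder body: simply echo the positional arguments back.
    return args

message = None
try:
    # Mimics a wrapper that calls func(**kwargs) with a key named "args".
    get_gam_dataframe(**{"args": ("print", "domains")})
except TypeError as exc:
    message = str(exc)

print(message)
```

A `*args` parameter can only be filled positionally, so Python treats `args=...` as an unknown keyword.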

In this case I'm able to work around the issue by simply changing the call signature to avoid using a variable argument list for the cached function, but the behavior with *args seems to be unexpected/a bug.

@cachier.cachier()
def get_gam_dataframe(args):
    output = run_gam(*args)
    return pandas.read_csv(StringIO(output))

domains = get_gam_dataframe(['print', 'domains'])
users   = get_gam_dataframe(['print', 'users', 'allfields'])
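
If the variadic call style needs to be kept for callers, one possible variation on this workaround is a thin wrapper that packs `*args` into a single tuple before calling the cached function. The sketch below is an illustration only: `functools.lru_cache` stands in for `@cachier.cachier()` so the example runs without cachier installed, and `run_gam`, `_get_gam_output_cached`, and `get_gam_output` are hypothetical names.

```python
from functools import lru_cache

calls = []  # records every real (uncached) invocation

def run_gam(*args):
    """Hypothetical stand-in for the reporter's run_gam helper."""
    calls.append(args)
    return ",".join(args)

@lru_cache(maxsize=None)  # stand-in; the issue would use @cachier.cachier() here
def _get_gam_output_cached(args):
    # args is one tuple, i.e. an ordinary named parameter for the cache.
    return run_gam(*args)

def get_gam_output(*args):
    """Public entry point that keeps the variadic call style."""
    return _get_gam_output_cached(tuple(args))

domains = get_gam_output("print", "domains")
users = get_gam_output("print", "users", "allfields")
domains_again = get_gam_output("print", "domains")  # served from the cache
```

This assumes the caching decorator keys correctly on a single named parameter, which is what the workaround above relies on.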

Borda commented 2 months ago

I think it is related to how we generate the unique key from arguments...
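
To illustrate how key generation could produce this symptom, here is a minimal sketch. It assumes, without having confirmed it against cachier's source, that positional values are paired with parameter names before hashing. `naive_args_to_kwargs` and `bound_args_to_kwargs` are hypothetical helpers, not cachier functions. For a `*args`-only signature there is a single name, `args`, so the naive pairing keeps only the first positional value and both calls from the report collapse to the same mapping. Binding through `inspect.Signature.bind` is shown as one possible alternative that keeps the full tuple.

```python
import inspect

def naive_args_to_kwargs(func, args, kwargs):
    """Hypothetical: pair positional values with parameter names via zip."""
    names = list(inspect.signature(func).parameters)
    merged = dict(zip(names, args))  # zip stops at the shorter sequence
    merged.update(kwargs)
    return merged

def bound_args_to_kwargs(func, args, kwargs):
    """Hypothetical alternative: let inspect bind *args as a whole tuple."""
    return dict(inspect.signature(func).bind(*args, **kwargs).arguments)

def get_gam_dataframe(*args):
    return args

call_a = ("print", "domains")
call_b = ("print", "users", "allfields")

naive_a = naive_args_to_kwargs(get_gam_dataframe, call_a, {})
naive_b = naive_args_to_kwargs(get_gam_dataframe, call_b, {})
bound_a = bound_args_to_kwargs(get_gam_dataframe, call_a, {})
bound_b = bound_args_to_kwargs(get_gam_dataframe, call_b, {})

print(naive_a == naive_b)  # identical mappings, so identical cache keys
print(bound_a == bound_b)  # distinct mappings
```

Under this assumption, any key derived from the naive mapping would be the same for both calls, which matches the reported behavior.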