sqlalchemy / dogpile.cache

dogpile.cache is a Python caching API which provides a generic interface to caching backends of any variety
https://dogpilecache.sqlalchemy.org
MIT License

Using front-end API is bugged on azure deployments #229

Open dre-gonzalez-l7 opened 2 years ago

dre-gonzalez-l7 commented 2 years ago

We established a cache using the redis backend and used the front-end to explicitly define keys using...

CacheRegion.set(key: str, value: Any)

If a region does not have a key set (checked using CacheRegion.get(key, expiration_time=None, ignore_expiration=False)), a value is set before being returned, as in the following:
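
A minimal sketch of that pattern (illustration only: the memory backend and get_some_data stand in for our real Redis configuration and data source):

from dogpile.cache import make_region

# Simplified backend for illustration; our real region uses dogpile.cache.redis.
region = make_region().configure('dogpile.cache.memory')

def get_some_data(key):
    return 'data for %s' % key  # placeholder for the real data source

def get_data(key):
    value = region.get(key)  # returns the falsy NO_VALUE sentinel on a miss
    if value:
        return value
    value = get_some_data(key)
    region.set(key, value)
    return value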

There was separate code that explicitly deleted specific keys using CacheRegion.delete(key: str). Think "a separate UI that can manually purge a specific cache key on demand". This functionality worked and was verified using RedisInsight on macOS Monterey.

☝️ This implementation, for reasons we could not determine, did not work when deployed on Azure.

When the same logic was implemented using cache_on_arguments with invalidate, with a region that used a function_key_generator, the implementation worked on Azure deployments. Furthermore, log messages were seen in the process output (which weren't seen when using the front-end API).

Not sure what the issue was. Feel free to ask for more info.

zzzeek commented 2 years ago

think about things that change on OSes. "keys", like strings, have case sensitivity, encodings, etc. timestamps / dates have different timezone assumptions.

other than that there's actually almost no detail here at all.

"using RedisInsight on macOS Monterey." - is this...the client? or running redis server also?

"azure" - what kind of redis do they host? I have no idea.

"did not work" - what does that mean? core dump ? stack trace? wrong data ?

overall, azure's a hosted environment and mac OS is not, so we would assume something changes about redis server and/or client. wouldn't be a dogpile issue directly.

dre-gonzalez-l7 commented 2 years ago

Thanks for responding so quickly! I'm looking into getting you some more information 👍

dre-gonzalez-l7 commented 2 years ago

To give a bit of clarification on the code itself, the following is a simplified version of what did not work:

from dogpile.cache import make_region

CACHE = make_region().configure(
    'dogpile.cache.redis',
    arguments={
        'host': 'localhost',
        'port': 6379,
        'db': 0,
        'redis_expiration_time': 60 * 60 * 24,  # 24 hours
        'distributed_lock': True,
        'thread_local_lock': False
    }
)
CACHE_KEY = 'key_{}'

def delete_cache_key(id):
    CACHE.delete(CACHE_KEY.format(id))

def get_data(id):
    # CACHE.get returns the NO_VALUE sentinel (which is falsy) on a miss,
    # so this check also treats a legitimately falsy cached value as a miss.
    cache_token = CACHE.get(CACHE_KEY.format(id))
    if cache_token:
        return cache_token

    this_data = get_some_data(id)  # some 'get_some_data' function
    CACHE.set(CACHE_KEY.format(id), this_data)
    return this_data

The following is a simplified version of what did work:

from dogpile.cache import make_region

def my_key_generator(namespace, fn, **kwargs):
    fname = fn.__name__
    def generate_key(*arg):
        # e.g. 'my_namespace_get_data_42' for get_data(42)
        return namespace + '_' + fname + '_' + '_'.join(str(s) for s in arg)
    return generate_key

CACHE = make_region(
    function_key_generator=my_key_generator,
    key_mangler=lambda key: "dogpile:" + key,  # prefix applied to every key stored in Redis
).configure(
    'dogpile.cache.redis',
    arguments={
        'host': 'localhost',
        'port': 6379,
        'db': 0,
        'redis_expiration_time': 60 * 60 * 24,  # 24 hours
        'distributed_lock': True,
        'thread_local_lock': False
    }
)

def delete_cache_key(id):
    get_data.invalidate(id)

@CACHE.cache_on_arguments(namespace='my_namespace')
def get_data(id):
    return get_some_data(id)  # some 'get_some_data' function
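
For reference (if I'm reading our key generator correctly), the keys this version ends up storing in Redis are namespaced and then mangled, e.g.:

generate = my_key_generator('my_namespace', get_data)
generate(42)               # 'my_namespace_get_data_42'
'dogpile:' + generate(42)  # the mangled key as it appears in Redis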

Elsewhere in the code, our functionality was something akin to the following:

...
# At load
get_data(some_id)
...

# If button was clicked...
delete_cache_key(some_id)
get_data(some_id)

"keys", like strings, have case sensitivity, encodings, etc

A good thing to be aware of, but I believe Redis keys are binary-safe... ?
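
For example (this assumes a reachable local server; note that redis-py encodes str keys to UTF-8 by default, so any encoding question would live in the client, not the server):

import redis

r = redis.Redis(host='localhost', port=6379, db=0)
r.set(b'key_\xc3\xa9', 'value')  # bytes key: any byte sequence is a valid Redis key
print(r.get('key_é'))            # b'value' - the str key UTF-8-encodes to the same bytes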

other than that there's actually almost no detail here at all.

Admittedly true. The thought behind filing this issue is that the deployment environment was identical between what did not work and what did work, which led us to think there might be a bug.

"did not work"

what does that mean? core dump ? stack trace? wrong data ?

Unfortunately, we were not aware of any specifics (no traces/dumps), other than that we weren't getting the data we expected.

"using RedisInsight on macOS Monterey."

is this...the client? or running redis server also?

This was just the client, and a line-by-line debugging session indicated that the redis db was, indeed, clearing the specified key before getting some new data.

"azure"

what kind of redis do they host? I have no idea.

Admittedly, I do not either. Our Dockerfile uses debian:11 with PostgreSQL 13 and Python 3.9.