Closed xmcp closed 3 weeks ago
Thanks for digging into this. I think this can (should?) also be solved by more carefully setting the get_cache_key
. Using this solves the issue:
def get_cache_key(request):
# EDIT: the cache key mush always be a string
return f"{request.url};{request.headers.get('Accept-Encoding')}"
The Accept-Encoding
header will be directly translated to chosen_algorithm
once _choose_compress_algorithm
is called, so the end result is the same as your PR #54 .
I agree the current situation needs to be improved, either by merging your PR, or updating the docs with better examples and warn about this pitfall.
Setting the cache key based on Accept-Encoding may cause redundancy in the cache, since different values of Accept-Encoding may lead to the same chosen_algorithm. This is not a huge problem, but I will argue that my PR is the more technically correct.
I am inclined to agree. Note that in practice, there are not many Accept-Encoding
values out there, as this set "statically" set per-browser.
Hope I'm not dogpiling, but I 100% agree that the solution that @xmcp provided is more elegant. I've implemented the cache key trick mentioned above and while it works, the new fix is highly desirable for those of us using both flask-caching and flask-compress in production. Huge kudos.
After more testing of this, it turns out releasing this would have broken quite a lot of things (for starters, pretty much all my projects 😉).
Flask-Caching relies on cachelib. If you look at the documentation of cache.get, you will see that the key must be of str
type.
It does not show in the example above, as the default simple
cache relies on a dict
, and using a tuple
as key will "just work", even though this will not pass static type checking.
The trouble begins when you use another cache type, personally I rely often on FileSystemCache
, as it allows flask applications served by multiple processes to share a common cache. If you try this, it will fail with the above patch, as there is code that builds a path from the key, and this fails if the key is not a str
.
The error I got when testing this was not obvious and did not match the raise TypeError
from cachelib
:
UnboundLocalError: cannot access local variable 'bkey_hash' where it is not associated with a value
I was a bit unlucky here, as the clear error in cachelib
when the key is not a str
was only added in version 0.10, and the latest release of flask-caching still pins cachelib<0.10
.
So the next steps:
str
FileSystemCache
In hindsight I should have required a test before merging, even though on simple
cache it would have passed.
Oops sorry for that. I only tested it with simple
cache.
@xmcp are you planning to open another PR?
I am busy with other things in the following days. I am okay with minor changes like casting the type to str
, but I won't have time to contribute test cases anyway. You can fix this issue if you like to.
Fixed by e3aeec8de4a47cffbbee1d6c7ab2af5f048ea11c
For the record release 1.16 shipped with that fix.
Currently the
after_request
function finds the cached response throughkey = self.cache_key(request)
.However, this key does not consider
chosen_algorithm
, so if a request wants brotli, the brotli response body will be cached, and following gzip requests will be incorrectly served with brotli body.Step to reproduce: