Open vicentealencar opened 9 years ago
Coming to think of it, I think using repr
for objects that do not provide a custom implementation of __repr__
, and therefore use object
's implementation, is dangerous and it is currently doing more harm than good.
object
's implementation will just print the class name along with the memory reference to that object (e.g.: '<object object at 0x1101140e0>'
). This string doesn't provide flask-cache
with information regarding the object's internal state. Therefore, mutating one of the object's properties won't change the result of calling repr
on that object, which might cause the memoized version of the method to return incorrect results (at least until the cache expires)
E.g.:
class X(object):
def __init__(self):
self.x = 0
@cache.memoize(10)
def cached_method(self, y):
return self.x + y
test = X()
test.cached_method(1)
# Returns 1
test.x = 2
test.cached_method(1)
# should return 3, but returns 1
A handful of bugs like that were being introduced to our project's code base. The issue above can be easily fixed by providing the class with a custom implementation of __repr__
that correctly outputs the object's internal state as part of its return value:
class X(object):
def __repr__(self):
return "X(%s)" % self.x
def __init__(self):
self.x = 0
@cache.memoize(10)
def cached_method(self, y):
return self.x + y
test = X()
test.cached_method(1)
# Returns 1
test.x = 2
test.cached_method(1)
# Returns 3
In order to prevent issues like the one described on the first example from happening again, we've overridden flask-cache
's memoize
method to raise a ValueError
whenever a memoized function parameter uses object
's implementation of __repr__
:
class SafeCache(Cache):
def memoize(self, *args, **kwargs):
unsafe_memoized = super(JusBrasilCache, self).memoize(*args, **kwargs)
def safe_memoized(func):
decorated_function = unsafe_memoized(func)
unsafe_make_key = decorated_function.make_cache_key
def safe_make_key(f, *args, **kwargs):
is_cacheable = lambda obj: hasattr(obj, '__repr__') and\
obj.__repr__() != object.__repr__(obj)
forbidden_params = filter(
lambda x: not is_cacheable(x),
list(args) + kwargs.values())
if forbidden_params:
raise ValueError("memoize cannot be used with functions with non-primitive parameters that do not provide a specific implementation of __repr__: %s" % forbidden_params)
else:
return unsafe_make_key(f, *args, **kwargs)
decorated_function.make_cache_key = safe_make_key
return decorated_function
return safe_memoized
I believe the above change should be incorporated by flask-cache, in order to prevent others from introducing bugs to their codebases. I can send flask-cache a pull request if we determine the solution above is appropriate.
The lib, in its current state, makes it very easy for developers to memoize instance/class methods without realizing that the
self
/cls
parameter will be used to calculate the cache key. This issue is very tricky to spot, since developers don't usually go check whether or not memoization is working properly. However, conceptually, I do agree that it makes sense to useself
/cls
to calculate the cache key, since the instance's internal state could potentially be used by the given method.I propose raising an exception with an informative error message every time
memoize
is used to decorate instance/class methods whose instances don't implement__cacherepr__
(per the current suggestion written in the comments). For the purposes of backwards compatibility, this exception could be swallowed by setting a configuration property tofalse
(e.g.:CACHE_CHECK_INSTANCE_MEMOIZATION
)