sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.45k stars 480 forks source link

overloading a cached_method in inherting/derived class #21281

Open dkrenn opened 8 years ago

dkrenn commented 8 years ago
class P(object):
    @cached_method
    def blub(self):
        print('called P.blub')
        return 3

class Q(P):
    def blub(self):
        print('called Q.blub')
        super(Q, self).blub()
        return 7

p = P()
q = Q()
print p.blub()
print p.blub()
print '---'
print q.blub()
print q.blub()

results in

called P.blub
3
3
---
called Q.blub
called P.blub
7
3

The last line should clearly be the output 7 again (and not the 3 of the base class).

CC: @orlitzky

Component: misc

Issue created by migration from https://trac.sagemath.org/ticket/21281

dkrenn commented 8 years ago

Replying to @dkrenn:

        super(Q, self).blub()

FYI, replacing this line by P.blub(self) results in

TypeError: __call__() takes exactly 0 positional arguments (1 given)
nbruin commented 8 years ago
comment:2

This is a side-effect of an intentional optimization in CachedMethodCallerNoArgs: if possible, it reaches into the instance dictionary and puts a higher optimized "value returner" there, to override the method lookup to be particularly fast:

sage: q = Q()
sage: q.__dict__
{}
sage: q.blub()
called Q.blub
called P.blub
7
sage: q.__dict__
{'blub': Cached version of <function blub at 0x7f83c8e59e60>}

The override happens already on the method lookup, so you need to reach in quite deep to avoid it:

sage: q = Q()
sage: super(Q,q).blub
Cached version of <function blub at 0x7f83c7d0f050>
sage: q.__dict__
{'blub': Cached version of <function blub at 0x7f83c7d0f050>}

(as soon as super(Q,q).blub has executed, any future q.blub will yield the cached version) Perhaps the simplest work-around is to manually restore the attribute:

class Q(P):
    @cached_method
    def blub(self):
        t=self.blub
        print('called Q.blub')
        super(Q, self).blub()
        self.blub=t
        return 7

(here illustrated with an overriding method that is a cached method as well -- the caching is not defeated)

This optimization used to be quite important, when cython classes didn't participate in the method lookup cache. It made these methods a LOT faster. Perhaps it isn't so super-important now any more. It's a bit of a pain to have to jump through such hoops depending on implementation details of "super" methods you still want to call ... (not to mention that on cython classes the caching mechanism might work differently!)

dkrenn commented 8 years ago
comment:3

Replying to @nbruin:

This is a side-effect of an intentional optimization in CachedMethodCallerNoArgs [...]

Ok. Actually, I have an input argument (but this was removed to get a minimal non-working example). So I have now boiled my code down to the following:

class P(object):
    @cached_method
    def __getitem__(self, n):
        print "P.__getitem__", n
        return n

class Q(P):
    @cached_method
    def __getitem__(self, n):
        print "Q.__getitem__", n
        return super(Q, self).__getitem__(n)

q = Q()
q[2]

This recurses in Q's __getitem__:

Q.__getitem__ 2
Q.__getitem__ 2
Q.__getitem__ 2
Q.__getitem__ 2
...
mkoeppe commented 8 years ago
comment:4

Related to / dup of #17201?

nbruin commented 8 years ago
comment:5

Replying to @dkrenn:

Ok. Actually, I have an input argument (but this was removed to get a minimal non-working example). So I have now boiled my code down to the following:

OK, I guess the problem also occurs with CachedMethodCaller:

sage: p=P()
sage: p.__dict__
{}
sage: p[1]
P.__getitem__ 1
1
sage: p.__dict__

{'__getitem__': Cached version of <function __getitem__ at 0x7eff7427d7d0>,
 '_cache____getitem__': {((1,), ()): 1}}

so it's the same story: CachedMethodCaller inserts a binding into the instance dictionary. This was a good idea when cython objects did not participate in the MRO cache, but now they do, so the time gain from this insertion is probably not so great anymore.

It seems a little excessive that the cache is also inserted into the instance dictionary. If you're going to put a special method into the instance dict anyway, you might as well store the cache in that method (in the form of a closure or a slot on the method) to save another dict lookup.

I'm a little mystified how we end up with an infinite recursion, though. Apparently the lookup for __getitem__ happens a little differently from other method names (perhaps because it's a method that the indexing slot on "object" objects calls to). It looks like the line super(Q,self).__getitem__ ends up fetching the thing stored on the instance dict of self already.

With other method names, you don't get that problem:

class P(object):
    @cached_method
    def U(self, n):
        print "P.U", n
        return n

class Q(P):
    @cached_method
    def U(self, n):
        print "Q.U", n
        b=super(Q, self).U(n+1)
        return b

However, this example does show how overriding and then calling cached methods doesn't work because both methods will be writing into the same cache on the instance object.

So in any case, if you're overriding a cached method with a cached method, you should be punching through the caching wrapper, so something along the lines of:

super(U,self).U.f(self,n+1)

In addition, for __getitem__ it seems necessary to do some more work. Probably the easiest is to just forget about "super" (you need very detailed information about whether things are cached anyway, so you're not going to survive big changes to the MRO anyway) and do:

class P(object):
    @cached_method
    def __getitem__(self, n):
        print "P.__getitem__", n
        return n

class Q(P):
    @cached_method
    def __getitem__(self, n):
        print "Q.__getitem__", n
        return P.__getitem__.f(self,n)

It's a mess, but basically necessarily so: caches are stored on instances and identified by the method name, so you can't just willy-nilly override cached methods and then call the super version. It will always be a bit messy.

dkrenn commented 8 years ago
comment:6

Replying to @nbruin:

Replying to @dkrenn:

Ok. Actually, I have an input argument (but this was removed to get a minimal non-working example). So I have now boiled my code down to the following:

OK, I guess the problem also occurs with CachedMethodCaller: [...] So in any case, if you're overriding a cached method with a cached method, you should be punching through the caching wrapper, so something along the lines of:

super(U,self).U.f(self,n+1)

Ok.

In addition, for __getitem__ it seems necessary to do some more work. Probably the easiest is to just forget about "super" (you need very detailed information about whether things are cached anyway, so you're not going to survive big changes to the MRO anyway) and do:

class P(object):
    @cached_method
    def __getitem__(self, n):
        print "P.__getitem__", n
        return n

class Q(P):
    @cached_method
    def __getitem__(self, n):
        print "Q.__getitem__", n
        return P.__getitem__.f(self,n)

Good. This workaround works for me.

It's a mess, but basically necessarily so: caches are stored on instances and identified by the method name, so you can't just willy-nilly override cached methods and then call the super version. It will always be a bit messy.

Ok.

Thanks you for your explaination and workaround.

orlitzky commented 3 years ago
comment:7

Looks like we have duplicates in #17201 and #31421.