zopefoundation / zope.annotation

Mechanism for storing additional information about objects without modifying their classes

Use functools.partial in AttributeAnnotations? #8

Closed jamadden closed 7 years ago

jamadden commented 7 years ago

Five of the methods in this class make calls to getattr with constant arguments. Replacing these with a single functools.partial appears to be faster (especially under Python 2.7 and PyPy2), both when the object has annotations and when it does not.

No annotations:

In [2]: import functools

In [3]: _GLOBAL = {}

In [4]: class A(object):
   ...:     def __init__(self, obj):
   ...:         self.obj = obj
   ...:         self.partial = functools.partial(getattr, self.obj, '__annotations__', _GLOBAL)
   ...:     def using_getattr(self):
   ...:         getattr(self.obj, '__annotations__', _GLOBAL)
   ...:     def using_partial(self):
   ...:         self.partial()
   ...:

In [5]: a = A(object())
In [8]: %timeit a.using_getattr()
1000000 loops, best of 3: 740 ns per loop

In [9]: %timeit a.using_partial()
1000000 loops, best of 3: 670 ns per loop

With annotations:

In [10]: class WithAnnot(object):
    ...:     def __init__(self):
    ...:         self.__annotations__ = {}
    ...:

In [11]: a = A(WithAnnot())

In [12]: %timeit a.using_getattr()
1000000 loops, best of 3: 439 ns per loop

In [13]: %timeit a.using_partial()
1000000 loops, best of 3: 376 ns per loop

PyPy also benefits from the partial approach:

In [4]: a = A(object())

In [5]: %timeit a.using_getattr()
10000000 loops, best of 7: 129 ns per loop

In [6]: %timeit a.using_partial()
10000000 loops, best of 7: 65.3 ns per loop

In [8]: a = A(WithAnnot())

In [9]: %timeit a.using_getattr()
10000000 loops, best of 7: 136 ns per loop

In [10]: %timeit a.using_partial()
10000000 loops, best of 7: 68.9 ns per loop

Here are the times for Python 3.5 with no annotations:

In [7]: %timeit a.using_getattr()
550 ns ± 4.08 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [8]: %timeit a.using_partial()
517 ns ± 2.57 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

And with annotations:

In [10]: %timeit a.using_getattr()
259 ns ± 2.24 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [11]: %timeit a.using_partial()
233 ns ± 3.99 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Of course, constructing the partial has a cost: it roughly doubles the object's construction time under all tested versions and implementations of Python (489 -> 887 ns on CPython 2.7, 468 -> 939 ns on CPython 3.5, and 108 -> 220 ns under PyPy2). We could hide that behind a lazy attribute if desired, but that still leaves the question: are enough keys accessed on a typical annotation object that the increased construction cost (no matter where we hide it) pays for itself? Under CPython 2, the break-even point is about 4 keys; it's much higher on CPython 3 and somewhat lower on PyPy.

(Or alternately, is the decreased code repetition and increased clarity worth the small performance trade off for annotations that have very few keys accessed?)
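The break-even point mentioned above is just the construction overhead divided by the per-access saving. A minimal sketch of that arithmetic (the numbers passed in are illustrative, not the exact measurements from this thread):

```python
import math

def break_even_accesses(construction_overhead_ns, saving_per_access_ns):
    """Smallest number of key accesses at which the partial pays for itself."""
    return math.ceil(construction_overhead_ns / saving_per_access_ns)

# Illustrative: if building the partial costs an extra 250 ns and each
# access saves 63 ns, the partial pays off after 4 accesses.
print(break_even_accesses(250, 63))  # -> 4
```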

icemac commented 7 years ago

I do not know of many applications that use 4 or more different annotation keys. There are some, but I think they are a minority.

@jamadden When does the construction of the partial function happen? When calling the annotations adapter?

jamadden commented 7 years ago

It could happen either at construction of the AttributeAnnotations object, or it could be deferred until first use, à la @Lazy or the like.

But I think that even in applications that use multiple annotation keys, that usage is usually spread out over the code base, and it's unlikely that the same AttributeAnnotations object is reused. More often, each use site does something like IAnnotations(object)['mykey']...which is exactly the pattern that the factory() function encourages, so even for applications that use the same key in different places, the underlying annotation adapter tends to get recreated. So it's hard to see how this can be a win without caching the partial somewhere external to the AttributeAnnotations object.

jamadden commented 7 years ago

We could use a closure as a method on self, similar to the partial, but using try/except instead:

In [5]: class A(object):
   ...:     def __init__(self, obj):
   ...:         self.obj = obj
   ...:         def _annot_def():
   ...:             try:
   ...:                 return obj.__annotations__
   ...:             except AttributeError:
   ...:                 return _GLOBAL
   ...:         self.using_partial = _annot_def

This makes things massively faster when the __annotations__ attribute is present, at the expense of being 25-35% slower on CPython when the attribute is missing (but faster across the board on PyPy):

| Method | Python 2.7 | Python 3.5 | PyPy 2.7 |
| --- | --- | --- | --- |
| getattr - no attribute | 740 ns | 550 ns | 129 ns |
| partial - no attribute | 670 ns | 517 ns | 65.3 ns |
| closure - no attribute | 1370 ns | 738 ns | 116 ns |
| getattr - with attribute | 439 ns | 259 ns | 136 ns |
| partial - with attribute | 376 ns | 233 ns | 68.9 ns |
| closure - with attribute | 281 ns | 175 ns | 33 ns |

Construction is only about 50% slower instead of 80-100% slower under CPython:

| Method | Python 2.7 | Python 3.5 | PyPy |
| --- | --- | --- | --- |
| getattr | 489 ns | 468 ns | 49.4 ns |
| partial | 887 ns | 939 ns | 220 ns |
| closure | 767 ns | 693 ns | 61.8 ns |

In CPython 2.7, that gives an overhead of 767 - 489 = 278 ns. For the case where the attribute is present, the total time to access an annotation using getattr would have been 489 + 439 = 928 ns; with the closure that becomes 767 + 281 = 1048 ns, about 13% slower. But a second access on the same annotation object is free.

In CPython 3.5, the getattr time is 748 ns, and with the closure it's 868 ns, about 16% slower.

Under PyPy, the time would go from 169 ns to 94 ns, so we'd actually get faster.

Still, given the typical pattern of only accessing one key per AttributeAnnotations instance, CPython would generally get slightly slower.

jamadden commented 7 years ago

If we made a local copy of the annotations in the constructor, we would eliminate the boilerplate from all five of the methods, and we'd change the __delitem__ and __setitem__ methods from using a try/except AttributeError to checking if self.__annotations__ is _EMPTY_STORAGE:

In [18]: class Copy(object):
    ...:    
    ...:     def __init__(self, obj):
    ...:         self.obj = obj # For BWC
    ...:         self.__annotations__ = getattr(obj, '__annotations__', _GLOBAL)

This results in the best construction times as well as the best access times, no matter whether or not the attribute is present:

| Method | Python 2.7 | Python 3.5 | PyPy |
| --- | --- | --- | --- |
| init - no attribute | 1030 ns | 962 ns | 142 ns |
| init - with attribute | 674 ns | 594 ns | 144 ns |
| access | 293 ns | 155 ns | 26.8 ns |

For the case where the attribute is present, the total access times are then:

| Method | Python 2.7 | Python 3.5 | PyPy |
| --- | --- | --- | --- |
| getattr (original) | 928 ns | 748 ns | 94 ns |
| direct (copy) | 967 ns | 749 ns | 170 ns |

So they're basically a wash on CPython 2 and 3, and PyPy gets a bit slower. I will note, though, that the times under CPython 2 are highly variable, as are those under PyPy (and this kind of micro-benchmarking is notoriously difficult on PyPy).

As a further point, here's a version that uses try/except for construction. It slows down construction when the attribute is missing under CPython, but speeds it up for all cases under PyPy:

In [35]: class CopyWithExc(object):
    ...:     def __init__(self, obj):
    ...:         self.obj = obj
    ...:         try:
    ...:             self.__annotations__ = obj.__annotations__
    ...:         except AttributeError:
    ...:             pass

| Method | Python 2.7 | Python 3.5 | PyPy |
| --- | --- | --- | --- |
| init - no attribute | 1690 ns | 1070 ns | 130 ns |
| init - with attribute | 620 ns | 516 ns | 85 ns |
| total access - with attribute | 913 ns | 671 ns | 112 ns |

I would say either variant is a win from a code standpoint; which one to choose just depends on how much of a penalty the missing attribute case should get. I think the getattr approach is probably a good balance between the two.

jamadden commented 7 years ago

I would say either variant is a win from a code standpoint

Argh. That potentially falls down if multiple different AttributeAnnotations objects are in use and modified for the same object that did not initially have any __annotations__. Is that a scenario to worry about? There are no test cases that exercise that scenario yet.

For example, currently:

class O(object): pass
o = O()
annot1 = AttributeAnnotations(o)
annot2 = AttributeAnnotations(o)

annot1['key'] = 42
assert len(annot2) == 1

But under the proposal:

class O(object): pass
o = O()
annot1 = AttributeAnnotations(o)
annot2 = AttributeAnnotations(o)

annot1['key'] = 42
assert len(annot2) == 0 # Not visible here, yet

tseaver commented 7 years ago

Why avoid scribbling on object.__annotations__ if it is not already set? Constructing an AttributeAnnotations instance for the object is a pretty clear sign that the app is ready / willing to write to that attribute.

jamadden commented 7 years ago

I agree that eagerly writing __annotations__ would make things simpler. But being willing to write to that attribute doesn't necessarily mean the app wants or expects to write to that attribute right now. The code might be something like:

if 'my-key' in IAnnotations(obj):
   # We've done something special with this object before, handle that case
# There is no else, we're not adding the key here. 

Or simply:

things = IAnnotations(obj).get('my-key', ())
for thing in things:
   ...

One concrete example I have of that in an app I work on has to do with invitation codes. Most users will never have invitations sent to them, but we need to check for each user when a certain page is accessed for that user.

The main benefit of delaying the creation of __annotations__ presumably has to do with Persistent objects: it avoids extra database writes (the write-on-read scenario, which is often a problem). And for the (rare, but they do exist) apps that use transaction.abort() for transactions that are supposed to be side-effect free, such as an HTTP GET request (thus enforcing HTTP semantics), eagerness leads to extra work creating and assigning an attribute that will be thrown away.

jamadden commented 7 years ago

(Oh, and before anyone asks: No, I don't have any evidence that AttributeAnnotations is a bottleneck in our code. I just noticed the duplication of identical function calls and my coding instincts were offended 😄 ; being faster would just be a nice side benefit.)