Open benmkw opened 1 year ago
We took a crack at this in Pyston with some pretty good results. One of the main reasons these dunder-using features are slow is that they use uncached attribute lookups, which are slow. Pyston and modern CPython both have attribute caches that the interpreter can use, but absent a more powerful specialization framework (which we had in Pyston v1 but not in Pyston v2) we need to find another place to cache this, and in Pyston we cache some of the dunder attributes directly on the type object.
In CPython we also cache dunder attribute lookups in the type object. However, `__add__` isn't one of them. So far I think we only have `__getitem__` and a few others.
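For context, implicit dunder dispatch already skips the instance dictionary and goes straight to the type, which is what makes caching on the type object viable in the first place. A small demo (the class and names are mine, just for illustration):

```python
class A:
    def __add__(self, other):
        return "type-level"

a = A()
# Shadow __add__ in the instance dict; the + operator ignores it,
# because implicit dunder lookup goes through type(a), not a.__dict__.
a.__add__ = lambda other: "instance-level"

print(a + a)         # "type-level": dispatched via the type
print(a.__add__(a))  # "instance-level": ordinary attribute lookup sees the instance dict
```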
Specialization of binary operations is a work in progress, and adding specializations for user-defined dunder methods is one possible improvement. There is a slight complication: it takes more inline cache space, so the small speedup from specializing this case may be outweighed by the slowdown caused by bigger code objects.
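To illustrate the space/speed tradeoff, here is a toy version of a per-call-site cache for a binary op. This is purely illustrative and entirely my own sketch, not CPython's actual implementation (the real inline caches live in the bytecode stream and also check a type version tag so they stay correct under monkeypatching):

```python
def make_cached_add():
    # One "inline cache" slot per call site -- this is the space cost
    # the specialization has to pay whether or not it ever gets a hit.
    cache = {"tp": None, "meth": None}

    def cached_add(a, b):
        tp = type(a)
        if cache["tp"] is not tp:
            # Cache miss: do the slow type-level lookup once, remember it.
            cache["tp"] = tp
            cache["meth"] = tp.__add__
        # Cache hit path: no attribute lookup at all.
        return cache["meth"](a, b)

    return cached_add

add = make_cached_add()
print(add(1, 2))  # 3: first call populates the cache for int
print(add(3, 4))  # 7: cache hit, lookup skipped
```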
@brandtbucher is this on your to do list?
I re-read this post https://lucumr.pocoo.org/2014/8/16/the-python-i-would-like-to-see/ and while I'm not deep enough in the details to have an opinion on the specifics, it seems like this simple benchmark is still worth optimising for, and still a relevant issue for Python today:
(from the post)
reproduction on my system:
So indeed, 8 years after the post appeared, this issue still seems relevant (although at a different scale).
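For anyone who wants to try this locally, here is a minimal benchmark in the same spirit (my own sketch, not the post's exact snippet), comparing `+` on a class with a user-defined `__add__` against builtin int addition; the gap is what specializing these dunder calls would shrink:

```python
import timeit

class Adder:
    def __add__(self, other):
        return 42

a, b = Adder(), Adder()

# + on a user-defined class: per-call __add__ lookup plus a Python-level call.
user_time = timeit.timeit("a + b", globals={"a": a, "b": b}, number=1_000_000)

# Builtin int addition for comparison: handled entirely in C.
int_time = timeit.timeit("x + y", globals={"x": 1, "y": 2}, number=1_000_000)

print(f"user-defined __add__: {user_time:.3f}s   builtin int +: {int_time:.3f}s")
```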