faster-cpython / ideas


Reduce use of `_PyType_Lookup` with more effective (inline) caching #517

Open markshannon opened 1 year ago

markshannon commented 1 year ago

The stats show some 3.5 billion calls to _PyType_Lookup (sum of all method cache hits and misses).

Using this branch: https://github.com/faster-cpython/cpython/tree/type-cache-stats-plus I added a "why" field to all calls to _PyType_Lookup.

Here are the results. I've added callers for the most common ones. The others can be found by looking at the above branch.

| Stats line | Count | Caller |
|---|---:|---|
| Method cache lookup 0 | 0 | |
| Method cache lookup 1 | 16,864 | |
| Method cache lookup 2 | 16,864 | |
| Method cache lookup 3 | 11,863,387 | analyze_descriptor |
| Method cache lookup 4 | 805,230 | |
| Method cache lookup 5 | 98,292 | |
| Method cache lookup 6 | 241,750 | |
| Method cache lookup 7 | 0 | |
| Method cache lookup 8 | 0 | |
| Method cache lookup 9 | 0 | |
| Method cache lookup 10 | 0 | |
| Method cache lookup 11 | 0 | |
| Method cache lookup 12 | 11,584,980 | method_getattro |
| Method cache lookup 13 | 92,208,240 | instancemethod_getattro |
| Method cache lookup 14 | 507,347,592 | lookup_maybe_method |
| Method cache lookup 15 | 0 | |
| Method cache lookup 16 | 0 | |
| Method cache lookup 17 | 13,909,244 | _Py_slot_tp_getattr_hook |
| Method cache lookup 18 | 13,909,244 | _Py_slot_tp_getattr_hook |
| Method cache lookup 19 | 18,823,380 | slot_tp_descr_get |
| Method cache lookup 20 | 317,617,134 | _PyObject_GetMethod |
| Method cache lookup 21 | 1,393,858,961 | _PyObject_GenericGetAttrWithDict |
| Method cache lookup 22 | 140,120,429 | _PyObject_GenericSetAttrWithDict |
| Method cache lookup 23 | 0 | |
| Method cache lookup 24 | 657,181,229 | type_getattro |
| Method cache lookup 25 | 590,504,780 | type_getattro |
| Method cache lookup 26 | 4,660 | |
| Method cache lookup 27 | 0 | |
| Method cache lookup 28 | 0 | |
| Method cache lookup 29 | 0 | |

The 1.39B calls to _PyObject_GenericGetAttrWithDict are probably from unspecialized calls to LOAD_ATTR. Improved specialization of LOAD_ATTR should remove most of them.

The large number of calls to type_getattro is surprising and worth investigating, especially as each call to type_getattro causes two calls to _PyType_Lookup. type_getattro looks expensive.
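To make the "two lookups per call" point concrete, here is a rough Python model of attribute access on a class object. It is only a sketch: `type_lookup` and `type_getattr` are hypothetical stand-ins for the C functions, the descriptor checks are simplified, and the `lookups` list exists purely to count MRO walks.

```python
_MISSING = object()
lookups = []  # records every simulated MRO walk as (type name, attribute name)

def type_lookup(tp, name):
    """Hypothetical stand-in for the C-level MRO lookup: scan each class dict."""
    lookups.append((tp.__name__, name))
    for base in tp.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING

def type_getattr(tp, name):
    """Simplified model of attribute access on a class (not the real C code)."""
    meta = type(tp)
    meta_attr = type_lookup(meta, name)            # lookup 1: on the metatype
    if meta_attr is not _MISSING and hasattr(type(meta_attr), "__set__"):
        return meta_attr.__get__(tp, meta)         # data descriptor wins early
    attr = type_lookup(tp, name)                   # lookup 2: on the type itself
    if attr is not _MISSING:
        if hasattr(type(attr), "__get__"):
            return attr.__get__(None, tp)
        return attr
    if meta_attr is not _MISSING:
        if hasattr(type(meta_attr), "__get__"):
            return meta_attr.__get__(tp, meta)
        return meta_attr
    raise AttributeError(name)

class C:
    value = 42

result = type_getattr(C, "value")
print(result, lookups)
```

In this model, reading `C.value` walks the MRO of `type` first and then the MRO of `C`, so an ordinary class attribute costs two lookups. Only a data descriptor found on the metatype (for example `__name__`) returns after the first one.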

Method lookup also seems to be a source of lookups. Improved specialization of LOAD_ATTR should help here as well.
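As a toy illustration of the inline caching idea in the title, the sketch below keeps one cache entry per call site and guards it with a type version number, which is bumped whenever the type is modified. `FakeType` and `CallSite` are hypothetical names for this example. CPython's real specializations guard on the type's version tag in a similar spirit, but this is not its implementation.

```python
import itertools

_next_version = itertools.count(1)  # global counter so versions are unique across types

class FakeType:
    """Hypothetical stand-in for a type object carrying a version tag."""
    def __init__(self, attrs):
        self.attrs = dict(attrs)
        self.version = next(_next_version)

    def set_attr(self, name, value):
        self.attrs[name] = value
        self.version = next(_next_version)  # any mutation invalidates caches

class CallSite:
    """One attribute-load site with a single-entry inline cache."""
    def __init__(self, name):
        self.name = name
        self.cached_version = 0   # 0 never matches a real version
        self.cached_value = None
        self.slow_lookups = 0

    def load(self, tp):
        if tp.version == self.cached_version:
            return self.cached_value          # fast path: no lookup at all
        self.slow_lookups += 1                # slow path: full lookup, then refill
        value = tp.attrs[self.name]
        self.cached_version = tp.version
        self.cached_value = value
        return value

t = FakeType({"f": 1})
site = CallSite("f")
for _ in range(1000):
    site.load(t)
print(site.slow_lookups)
```

After the first load, every further load at the same site takes the fast path until `set_attr` changes the type, at which point the version guard fails and the site falls back to one slow lookup before caching again.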

Fidget-Spinner commented 1 year ago

Two of those we probably can't do away with:

- _PyObject_GetMethod (C API Call)
- lookup_maybe_method (I forgot where this is from but I vaguely recall it being used by our internal C stuff too)

Fidget-Spinner commented 1 year ago

Also, the biggest source of failures is actually loading attributes from managed dictionaries. Failures for "method" also count failures for "has managed dict", so "has_managed_dict" is actually double counted! (We call SPEC_FAIL_ATTR_HAS_MANAGED_DICT, then SPEC_FAIL_ATTR_METHOD next.)

https://github.com/faster-cpython/ideas/blob/main/stats/pystats-2022-12-12-python-1583c6e.md#specialization-attempts-5

I will work on this specialisation.

ericsnowcurrently commented 1 year ago

In case it matters, note that the pickle module (and copy) looks up __reduce__, etc. on the object rather than on its type. That said, there is no tp_reduce.
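A small sketch of the difference, using `copy` since it is easy to run standalone. The class `Point` and the function `fake_reduce_ex` are made up for the example. An attribute named `__reduce_ex__` stored on the instance is honoured by `copy.copy`, whereas `len()` ignores an instance-level `__len__` because it goes through the type.

```python
import copy

class Point:
    def __init__(self, x):
        self.x = x

calls = []

def fake_reduce_ex(protocol):
    # Records that it was called and rebuilds the object with a marker value.
    calls.append(protocol)
    return (Point, (99,))

p = Point(1)
p.__reduce_ex__ = fake_reduce_ex   # stored on the instance, not on the type
q = copy.copy(p)                   # finds the instance attribute and calls it

p.__len__ = lambda: 5              # also stored on the instance
try:
    len(p)                         # len() looks at type(p), so this fails
except TypeError:
    len_used_instance_attr = False
else:
    len_used_instance_attr = True

print(q.x, len(calls), len_used_instance_attr)
```

Because the lookup starts at the object, it cannot be answered from the type alone, which is why this kind of access is hard to route through a type slot.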

Fidget-Spinner commented 1 year ago

> In case it matters, note that the pickle module (and copy) looks up __reduce__, etc. on the object rather than on its type. That said, there is no tp_reduce.

Yeah, this is unavoidable in many modules. For example, print itself does one or two lookups (if I recall, one for sys.stdout and another one elsewhere).