faster-cpython / ideas

1.67k stars 49 forks source link

Make class lookup cache per-class, instead of global. #532

Open markshannon opened 1 year ago

markshannon commented 1 year ago

The global "method cache" used by PyType_Lookup() has 4k entries. This means that it is too sparse for most programs , but too dense for some very large programs.

The problem with small caches is collisions. Two different names for the same class are much more likely to collide in a 16 entry per-class cache than in a 4000 entry global cache.

However, if the per-class cache serves not as hash-table, but an indexed array for inline caches, this isn't a problem.

Having per-class caches will allow us to shrink the size of inline caches a lot, but we need to take care not to wreck the performance of PyType_Lookup(). If we were to reduce the number of calls to PyType_Lookup() by an order of magnitude or more, see https://github.com/faster-cpython/ideas/issues/517, then performance of PyType_Lookup() becomes relatively unimportant, as it would mainly serve as a lookup when specializing.

Fidget-Spinner commented 1 year ago

However, if the per-class cache serves not as hash-table, but an indexed array for inline caches, this isn't a problem.

What about internal (C API or internal C function) calls to _PyType_Lookup. Wouldn't those suffer?

markshannon commented 1 year ago

What about internal (C API or internal C function) calls to _PyType_Lookup. Wouldn't those suffer?

Probably yes, but it won't matter if there aren't many of them. Hence #517

Fidget-Spinner commented 1 year ago

I have a PR up at https://github.com/python/cpython/pull/100869.