Open markshannon opened 8 months ago
we should have the compiler inject a special attribute, say __expected_attributes__,
Do you mean that this should be added as a field to PyTypeObject?
No need to mess with the C structs. I was thinking that it would be done mostly in the compiler. The compiler would convert:
class C:
    def meth(self):
        self.arg = ...
into
class C:
    def meth(self):
        self.arg = ...
    __expected_attributes__ = ("arg", ...)
We would also need some code in PyType_Ready() to consume the __expected_attributes__ and deduce the correct size limits for the shared keys.
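To make the compiler transformation above concrete, here is a minimal sketch of the static analysis done with the standard `ast` module. It is not how the CPython compiler implements this; the function name `expected_attributes` is hypothetical, and the sketch assumes the first positional parameter of every method is the instance.

```python
import ast


def expected_attributes(source: str) -> dict[str, tuple[str, ...]]:
    """Map each class name to the sorted names assigned as ``self.<name> = ...``.

    Hypothetical helper for illustration only, not a CPython API.
    """
    result = {}
    for cls in ast.walk(ast.parse(source)):
        if not isinstance(cls, ast.ClassDef):
            continue
        names = set()
        for func in cls.body:
            if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            if not func.args.args:
                continue
            # Assumption: the first positional parameter is the instance.
            self_name = func.args.args[0].arg
            for node in ast.walk(func):
                if (
                    isinstance(node, ast.Attribute)
                    and isinstance(node.ctx, ast.Store)  # assignment target only
                    and isinstance(node.value, ast.Name)
                    and node.value.id == self_name
                ):
                    names.add(node.attr)
        result[cls.name] = tuple(sorted(names))
    return result


SOURCE = """
class C:
    def meth(self):
        self.arg = 1

    def other(self):
        self.x, self.y = 0, 0
        print(self.z)  # a read, not an assignment, so it is ignored
"""

attrs = expected_attributes(SOURCE)
```

The sketch treats static methods and class methods like ordinary methods, so a real implementation would need to be more careful about what the first parameter refers to.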
We use the following heuristic: Start with 30 items and reduce by one each time we create an object until the size of the array is no more than one greater than the number of keys in the shared keys.
As discussed offline, we need to redesign this heuristic. The number of shared keys after an object is created is no longer assumed to be sufficient - we want to leave enough room for the expected attributes.
We now have __expected_attributes__
Apparently, adding a new attribute to classes breaks several real-world packages that do some kind of introspection, at least zope.interface and typing_extensions.
Could you add an entry about this change to What's New?
For 3.13 we are going to use the __static_attributes__ as a template for the shared keys. It is not ideal, as it doesn't account for superclasses, but it is better than nothing (which is what we have now). https://github.com/python/cpython/pull/118468
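As an illustration of the superclass limitation, the sketch below merges a per-class attribute tuple along the MRO. The helper `merged_attributes` is hypothetical and is not something the linked PR is stated to do. On interpreters older than 3.13, where `__static_attributes__` does not exist, it simply returns an empty tuple.

```python
def merged_attributes(cls, attr_name="__static_attributes__"):
    """Collect the per-class tuples named attr_name, base classes first.

    Hypothetical helper: shows what accounting for superclasses could mean.
    """
    seen = []
    for klass in reversed(cls.__mro__):
        # vars(klass) looks only at this class, not at inherited entries.
        for name in vars(klass).get(attr_name, ()):
            if name not in seen:
                seen.append(name)
    return tuple(seen)


class Base:
    def setup(self):
        self.a = 1


class Derived(Base):
    def run(self):
        self.b = 2


# A version-independent demo that uses a manually written tuple instead.
class P:
    demo = ("x",)


class Q(P):
    demo = ("y", "x")


merged_real = merged_attributes(Derived)
merged_demo = merged_attributes(Q, attr_name="demo")
```

On 3.13 and later, `Derived.__static_attributes__` alone only lists `b`, while the merged view also includes `a` from `Base`.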
For 3.14 we can be a bit more sophisticated.
Feature or enhancement
With https://github.com/python/cpython/pull/28802 we only create dictionaries when needed, which means we need to guess how big to create the values array attached to each object.
We use the following heuristic: Start with 30 items and reduce by one each time we create an object until the size of the array is no more than one greater than the number of keys in the shared keys. This works reasonably well, but it could be improved. Many classes have a fixed set of attributes, but they are not all used during early object creation. Static analysis could give us a better estimate of the size of the values array to use, by pre-initializing the shared keys.
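The heuristic can be simulated in a few lines. This is a rough model rather than the CPython implementation: the names are hypothetical, the number of shared keys is held constant, and the shrink step is assumed to happen after each object is allocated.

```python
INITIAL_SIZE = 30  # starting size given in the description above


def next_values_size(current_size: int, shared_key_count: int) -> int:
    """Shrink by one unless the size is already at most keys + 1."""
    if current_size > shared_key_count + 1:
        return current_size - 1
    return current_size


def sizes_for_objects(shared_key_count: int, objects: int) -> list[int]:
    """Return the values-array size each successive object would get."""
    size = INITIAL_SIZE
    sizes = []
    for _ in range(objects):
        sizes.append(size)
        # Assumption: the limit is adjusted after each allocation.
        size = next_values_size(size, shared_key_count)
    return sizes


# A class whose early objects only ever set 3 attributes.
sizes = sizes_for_objects(shared_key_count=3, objects=30)
```

In this model the size settles at 4 for a class with 3 shared keys, so a class that assigns further attributes only later in an object's life ends up with too little room. That is the case static analysis is meant to cover.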
For each class body we should have the compiler inject a special attribute, say __expected_attributes__, which can be used at runtime to compute the expected attributes for the class.
For an example of code where static analysis could work, but our current dynamic approach does not, see https://github.com/python/pyperformance/blob/main/pyperformance/data-files/benchmarks/bm_go/run_benchmark.py