faster-cpython / ideas


Repurpose deepfreeze to deepfreeze classes, not code objects. #573

Open markshannon opened 1 year ago

markshannon commented 1 year ago

For multiple interpreters to work, we need static objects that can be safely shared across interpreters. Those objects include trivial, small objects like None and True, which are easy to handle. However, we also want to share large, complex objects like int.

Classes, like int, have a dictionary which is (thankfully) immutable and contains references to strings, builtin objects, etc.

Any object must be outlived by any object that it refers to, so any object that an immortal object refers to must be immortal. Immutable objects can refer to mutable objects, but objects that can be shared can only refer to other objects that can be shared.

Since we are making objects immutable so that they can be shared, we should treat immutability and shareability as going together. Thus, an immutable/shareable object can only refer to other immutable/shareable objects.
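
A minimal sketch of that invariant, written as a check over an object graph. The helper name `is_deeply_immutable` and the choice of which types count as immutable are assumptions made for illustration; nothing like this is proposed in the issue itself.

```python
# Hypothetical helper, not part of CPython: checks that an object refers
# only to immutable objects, transitively.

# Assumption: these exact types stand in for "immutable/shareable" here.
IMMUTABLE_LEAVES = (type(None), bool, int, float, complex, str, bytes)
IMMUTABLE_CONTAINERS = (tuple, frozenset)


def is_deeply_immutable(obj, _seen=None):
    """Return True if obj and everything reachable from it is immutable."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        # Already visited through another reference; no need to recheck.
        return True
    seen.add(id(obj))
    # Exact type checks: a subclass could add mutable state.
    if type(obj) in IMMUTABLE_LEAVES:
        return True
    if type(obj) in IMMUTABLE_CONTAINERS:
        return all(is_deeply_immutable(item, seen) for item in obj)
    return False  # lists, dicts, arbitrary instances, ...
```

For example, `is_deeply_immutable(("a", ("b", 1)))` is true, while a tuple containing a list fails the check because the tuple itself is immutable but refers to a mutable object.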

Defining a class as a PyTypeObject manually is already a bit fiddly, but having to define, in C, the entire object graph of the type's dict, mro, etc. would be horrible.

Fortunately we already have a tool that walks an object graph and emits the necessary C structs: deepfreeze.

By adding the capability to traverse classes, dicts, and a few other objects, we can have deepfreeze emit the object graph for builtin objects. The current approach of initializing and finalizing/deallocating static objects is broken.
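
To make the "walk the graph, emit C structs" idea concrete, here is a toy sketch. It is not the real deepfreeze: the struct names (`toy_str`, `toy_int`, `toy_tuple`) and the output layout are invented for illustration, and it only handles str, int, and tuple.

```python
def emit_graph(root):
    """Walk a graph of str/int/tuple objects and return C-like declarations.

    Referenced objects are emitted before the objects that refer to them,
    and each distinct object is emitted only once.
    """
    names = {}   # id(obj) -> generated identifier
    lines = []   # emitted declarations, in dependency order

    def visit(obj):
        if id(obj) in names:
            return names[id(obj)]
        if isinstance(obj, tuple):
            # Emit the items first so the tuple can take their addresses.
            items = [visit(item) for item in obj]
            name = f"toy_tuple_{len(names)}"
            body = ", ".join(f"&{item}" for item in items)
            lines.append(
                f"static toy_tuple {name} = {{ {len(items)}, {{ {body} }} }};"
            )
        elif isinstance(obj, str):
            # Assumption: plain ASCII text with no quotes or backslashes.
            name = f"toy_str_{len(names)}"
            lines.append(f'static toy_str {name} = {{ "{obj}" }};')
        elif isinstance(obj, int):
            name = f"toy_int_{len(names)}"
            lines.append(f"static toy_int {name} = {{ {obj} }};")
        else:
            raise TypeError(f"cannot freeze {type(obj).__name__}")
        names[id(obj)] = name
        return name

    visit(root)
    return lines
```

Extending the walker to classes and dicts, as proposed here, would mean adding more branches of the same shape: visit everything the object refers to, then emit a static struct that points at the results.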

(Deep freezing code objects is dubious, as they are internally mutable, so repurposing deepfreeze would fix that problem as well.)

iritkatriel commented 1 year ago

Any object must be outlived by any object that it refers to, so any object that an immortal object refers to must be immortal.

Should this be "any object that an immutable object refers to must be immutable"?

(I don't see why it is true for a mutable immortal).

carljm commented 1 year ago

Unless reference counting is broken, any object referred to by an immortal object will be in practice immortal (unless the immortal object is mutable and loses its reference), whether it is formally marked as immortal or not. So it's not required for correctness (but probably preferable for efficiency) for all objects referred to by an immortal object to be actually marked as immortal.
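
A toy model of this argument, assuming a much simplified refcounting scheme. `ToyObject` is invented for illustration and does not mirror CPython's actual object layout or its immortality mechanism.

```python
class ToyObject:
    """Toy stand-in for a reference-counted object."""

    def __init__(self, name, refs=(), immortal=False):
        self.name = name
        self.refs = list(refs)
        self.immortal = immortal
        self.refcount = 0
        self.alive = True
        for target in self.refs:
            target.refcount += 1   # each outgoing reference is counted

    def decref(self):
        if self.immortal:
            return                 # an immortal object is never deallocated
        self.refcount -= 1
        if self.refcount == 0:
            self.alive = False
            for target in self.refs:
                target.decref()    # release everything it referred to


# A string that is not marked immortal, held by one local reference.
name_str = ToyObject("name string")
name_str.refcount += 1

# An immortal class that also refers to the string.
int_type = ToyObject("int", refs=[name_str], immortal=True)

# Dropping the local reference leaves the class's reference in place.
name_str.decref()
alive_while_held = name_str.alive

# The exception noted above: a mutable immortal object can drop its reference.
int_type.refs.remove(name_str)
name_str.decref()
alive_after_drop = name_str.alive
```

In this model `alive_while_held` is true and `alive_after_drop` is false: the string stays alive for as long as the immortal object keeps referring to it, without being marked immortal itself.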

Of course the correctness requirements get stricter when we start talking about immutability and shareability between interpreters.

markshannon commented 1 year ago

(I don't see why it is true for a mutable immortal).

There is no point in having mutable immortal objects. The only reason for having immortal objects is to share them, but they can't be shared if they are mutable.

gvanrossum commented 1 year ago

Where would the classes come from? Where would deepfreeze be invoked during the build process?

quark-zju commented 1 year ago

Does this mean that types can be "ready" without PyType_Ready? It would be nice to reduce PyTypes_Init overhead.

markshannon commented 11 months ago

Where would deepfreeze be invoked during the build process?

We would build the object graph dynamically using _bootstrap_python, but we can extract the names using any Python.

Make should handle the build: python would depend on deepfrozen.o, and deepfrozen.c would depend on deepfreeze_objects.py and deepfreeze.py. The first would specify the object graph and the second would freeze it.

The code in deepfreeze_objects.py would build the object graphs; all variables whose names start with "Py_" would be exported.

_bootstrap_python could include all the symbols, but they might all refer to None to avoid

Example code in deepfreeze_objects.py:

Py_str_hi = "hi"
Py_names_a_b_c = ("a", "b", "c")
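
One way the "names starting with Py_ are exported" rule could be applied, as a rough sketch. The function name `exported_objects` and the use of `exec` on the source text are assumptions; the thread does not say how the freezing tool would load deepfreeze_objects.py.

```python
def exported_objects(source):
    """Run source text and return its top-level names starting with 'Py_'."""
    namespace = {}
    exec(source, namespace)
    return {
        name: value
        for name, value in namespace.items()
        if name.startswith("Py_")
    }


# The two definitions from the example above, plus a helper name that
# should not be exported.
EXAMPLE = '''
Py_str_hi = "hi"
Py_names_a_b_c = ("a", "b", "c")
_helper = "not exported"
'''

exports = exported_objects(EXAMPLE)
```

Here `exports` holds only `Py_str_hi` and `Py_names_a_b_c`; each value would then be the root of an object graph handed to the freezing step.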

gvanrossum commented 10 months ago

@markshannon Are you proposing we do this for 3.13? We should figure out who can work on it.