Open markshannon opened 1 year ago
Sounds promising! It'd be a big chunk of work though.
You won't be surprised to hear that I don't think it can go in without some support for legacy “mystery layouts”, and structs like PyListObject
will probably still need to be exposed as structs even if they use the new layout (it's in the tutorial, people use it).
You might want to separate adding the new API and deprecating/removing the old stuff, unless you want to somehow switch in a single release.
There might be some overlap between “legacy mystery layout” and “Primary”. Perhaps it would work to allow multiple Primaries in the MRO, with single inheritance only, if the subclass has “intimate knowledge” of the base's primary struct? With that, this would be pretty much backwards-compatible (though it wouldn't help that much with changes to the core object fields, and I suspect that's part of your motivation).
The proposal looks essentially compatible with PEP 697:
PyObject_GetTypeData
would be straightforward, using PyObject_GetStructOffset
PyType_GetTypeDataSize
might not be straightforward, but should be stored/computable, and will probably be useful on its ownPy_TPFLAGS_ITEMS_AT_END
would be unused, or set automatically if that would be usefulPy_RELATIVE_OFFSET
would work nicelyPyObject_GetItemData
could work with the new layout, rather than just with Py_TPFLAGS_ITEMS_AT_END
-- perhaps the VM would need a bit more info, or even manage “size & items” (struct tuple_data
), while allowing non-managed variable-sized types like int
tp_basicsize
would add a struct to the appropriate place.The offset of the start of primary structs is an unstable API constant
Could we have a PyObject_GetPrimaryStruct()
function, and treat PyUnstable_PRIMARY_STRUCT_OFFSET
as a lower-level thing? Let's abstract the raw pointer addition away where we can.
HasLength: The objects have length, e.g. list, tuple VariableSized: The struct is variable sized, e.g. tuple (but not dict or list). Implies
HasLength
I read that as:
HasLength
enables Py_SIZE
.VariableSized
enables PyObject_NewVar
, and implies that Py_SIZE
stores the lengthConsidering a recent Discourse topic, we might want:
HasLength
enables Py_SIZE
.VariableSized | HasLength
enables PyObject_NewVar
, and Py_SIZE
(backed by a VM-managed field) stores the length.VariableSized
only enables PyObject_NewVar
, but the user manages the size field (letting them e.g. pack additional bits into it).You won't be surprised to hear that I don't think it can go in without some support for legacy “mystery layouts”,
Of course, everything we do needs to support mystery legacy [insert feature here]. It's just one bit of information, so shouldn't be a problem.
There might be some overlap between “legacy mystery layout” and “Primary”.
Maybe, although I'm inclined to just treat "legacy mystery layout” as a black box to start with.
Could we have a PyObject_GetPrimaryStruct() function, and treat PyUnstable_PRIMARY_STRUCT_OFFSET as a lower-level thing?
Sure, we can make PyUnstable_PRIMARY_STRUCT_OFFSET
part of the "unlimited" API, and add PyObject_GetPrimaryStruct()
to the limited API.
The proposal looks essentially compatible with PEP 697
I think there is a difference when subclassing a class of variable sized objects, V
(eg. tuple
) and another non-primary, non-variable class, C
In PEP 697, the data for C
goes after the data for V
, which means that the offset is a function of the MRO and the object.
In this proposal, the data for C
goes before the object header, so is a function of just the MRO, meaning that it's fields can be accessed as fast any other slot (assuming an adaptive optimizer).
I read that as: HasLength enables Py_SIZE.
It is functionally equivalent, yes.
But Py_SIZE
is broken, it makes an unchecked cast and hopes for the best. We can do better.
int
is variable sized, but Py_SIZE
gives the wrong value.
VariableSized enables PyObject_NewVar, and implies that Py_SIZE stores the length
No, because we can have a primary class and a variable sized class in an MRO, which means that variable sized data will not be in the primary position.
Regarding the size of variable sized structs:
We can reasonably restrict the size to confirm to the following equation: base_size + item_count * item_size
, where base_size
and item_size
are fixed when the class is created.
However, we need to allow classes to compute item_count
, as they might not store it directly.
So we need another flag, say HasManagedLength
which delegates the storage of item_count
to the VM for most classes that don't do anything funny with the item_count
.
We also need to allow classes to provide their own get_item_count()
functions.
In PEP 697, the data for C goes after the data for V, which means that the offset is a function of the MRO and the object.
PEP 697 primarily specifies an API. It also, necessarily, describes how the API will work with (a subset of) the current class layouts. The API can work with the layout proposed here.
Maybe, although I'm inclined to just treat "legacy mystery layout” as a black box to start with.
Maybe you can do that, but you can't ignore it entirely. How does inheritance work between old black boxes and the new layout?
But Py_SIZE is broken, it makes an unchecked cast and hopes for the best. We can do better.
So, what happens to the Py_SIZE
API? It's removed because it's not correct in some cases?
[edit: you just answered the following]
No, because we can have a primary class and a variable sized class in an MRO, which means that variable sized data will not be in the primary position.
There are two use cases:
- Python knows where and how the size data is stored, so it can e.g. provide a correct default
__sizeof__
- Only the class knows where and how the size data is stored, so it can store it in creative ways
Does this proposal enable one of these, or both?
What exactly does HasLength
mean?
Hello, I became aware of this discussion through https://discuss.python.org/t/24136 and would like to chime in on one aspect. Please excuse me if I have overseen other important aspects.
@markshannon wrote:
* HasLength: The objects have length, e.g. `list`, `tuple` * VariableSized: The struct is variable sized, e.g. `tuple` (but not `dict` or `list`). Implies `HasLength`
Please note that having a length in the sense of __len__
and being a variable sized PyObject
in the sense of PyVarObject
are two entirely independent concepts.
The best example is the built-in PyLong
which has no length in the sense of __len__
at all, but is still variable length. What's more, it doesn't store that length in the ob_size
field of PyVarObject
. (Instead it uses the sign that field to store the sign of the underlying number, and the absolute value to store the length.)
For other examples, consider a type representing an (immutable) tree, or a small multidimensional array: both can be usefully represented as variable size PyObjects
that either have no __len__
or a __len__
that is unrelated to the size of their data payload.
That's why in a redesign of python object layout it may be worthwhile to consider getting rid of any equivalent of today's Py_SIZE
and Py_SET_SIZE
. It seems that these have very little use (providing a default implementation of __sizeof__
?), and are actually "abused" even by built-in types to store payload.
Declaring fields If we want to expose a C field to the VM, we need to declare its type, and offset. In addition we want to declare various attributes, much as are already declared in
PyMemberDef
One additional capability we would like to add, that PyMemberDef doesn't have, is to have fields that can be directly read, but have a function setter, which suggests the following struct:struct PyAttributeDef { const char *name; const char *doc; uint32_t offset; int16_t flags; uint8_t type; getter get; setter set; };
This effectively merges PyMemberDef
and PyGetSetDef
. Yay for deduplication!
What's missing is a void*
arg that's passed to the getter/setter.
Instead of that we could always pass a pointer to the declaring class (edit: and the PyAttributeDef
), and make it easy to store C data on a class object.
What's missing is a
void*
arg that's passed to the getter/setter. Instead of that we could always pass a pointer to the declaring class, and make it easy to store C data on a class object.
Could you remind us of the use case for this?
Telling multiple similar getters/setters apart. It's general good practice to add such a context for callbacks. Here it looks like it'd be handy when the class is auto-generated -- you don't want to generate JIT-style parametrized functions. But yes, I'd need to check if it's actually used in the wild.
See also: making struct _typeobject
a plain C struct -- https://github.com/faster-cpython/ideas/issues/672
Could the "legacy mystery layout" of PyDictKeysObject
, which has two variable-sized items/arrays (indices, and entries), be supported?
We stated above that if the "Grand Unified Python Object Layout" could not support classes like tuple or dict, it would not be reasonable to expect C extensions to use it.
I find the design of dict
(/the underlying table) really nice, but it's not obvious to me how it would be implemented using this API. It feels like a general solution would need to do a lot of back-bending to accommodate this kind of layout
This is likely to need to end up as PEP, but I want to sketch out the idea here first.
General idea
All Python objects should share a common layout that makes the layout visible to the VM, and provides the performance needed by C extensions. In order to ensure that it is fast enough we should use it for internal classes, like tuples, lists and dicts. If it is fast enough for those classes, it will be good enough for NumPy and Cython. Persuading them to use a new API will be another issue, but we need to provide the capability first.
Requirements
Performance
Important classes ("important" must be user defined, or it won't get used) should be able to get at least the speed of manually defined C code. Because the VM will have visibility into the classes, we should be able to get better performance in some cases.
Composability
Most classes should be freely composable, including through multiple inheritance.
Flexibility
The design shouldn't put unnecessary constraints on either the VM implementation or the C extension code.
Conceptual interface
One layout per class
The layout of a Python object depends on its class. All instances of a class have the same layout, from the point of view of C extensions (the VM may use "object shapes" or some other optimization resulting in differing physical layout in a way that is not visible to C extensions.
C extensions define a C struct and requirements, not layout
When defining a C extension type, the C struct will be described to the VM in terms of size and alignment, and the details of accessible fields in it, much like PEP 697, but without the control over layout that PEP 697 gives. Layout is up to the VM.
Layouts aren't inherited
If class C inherits from class B, its layout is not inherited. In other words if class
B
defines a structS
, the offset ofS
within an objectC()
may differ from the offset ofS
within an objectB()
.Extension class properties
Each extension defined class can have the following properties, which are not inherited.
list
,tuple
tuple
(but notdict
orlist
). ImpliesHasLength
No class can have more than one class in its MRO (including itself) which each of the above property.
The C-API.
To get the offset of a struct:
intptr_t PyObject_GetStructOffset(PyTypeObject *self_class, PyTypeObject *declaring_class);
will return the offset, in bytes, from the owningPyObject *
to the requires struct. The offset may be negative, a return value of -1 indicates an error.PyObject_GetStructOffset
is a pure function of the triple(sys.implementation, self_class, declaring_class)
meaning that the offset may be cached globally during execution but should not be cached in source code or to disk.The offset of the start of primary structs is an unstable API constant
const intptr_t PyUnstable_PRIMARY_STRUCT_OFFSET
is a compile-time constant (which I expected to always be positive, but lets leave that open for now)This allows "primary" classes to get performance on a par with the current implementation of, for example,
PyTupleObject
.Declaring fields.
If we want to expose a C field to the VM, we need to declare its type, and offset. In addition we want to declare various attributes, much as are already declared in
PyMemberDef
One additional capability we would like to add, thatPyMemberDef
doesn't have, is to have fields that can be directly read, but have a function setter, which suggests the following struct:If
get
is NULL, it implies the field can be read directly, unless theWRITE_ONLY
flag is set. Ifset
is NULL, it implies the field can be written directly, unless theREAD_ONLY
flag is set.A possible implementation.
Objects are laid as follows:
__dict__
/values,__weakrefs__
, etc.)The object pointer (PyObject *) points into the middle of the the core object, as it has done since the cycle GC was introduced. The offset from the object pointer to the end of the core object will be a multiple of the maximum C alignment and will be
PyUnstable_PRIMARY_STRUCT_OFFSET
.Example.
We stated above that if the "Grand Unified Python Object Layout" could not support classes like
tuple
ordict
, it would not be reasonable to expect C extensions to use it.We would define tuple with the following struct:
And declare that
tuple
is a primary, variable-sized type.We can then define the accessor functions efficiently as:
The length property could be defined as follows (we wouldn't do this in practice as
().length
isn't a thing)