Closed neonene closed 2 months ago
I just want to say that I like the overall approach :-)
Ditto; I also like the approach. (It's been discussed over at Petr's fork earlier.)
I've reviewed the implementation, focusing mainly on docs, and sent suggestions as a pull request to neonene's PR: https://github.com/neonene/cpython/pull/1
Some considerations:
- The `Py_tp_token` slot corresponds to `ht_token`, so by current convention it might be named `Py_ht_token` (`ht_`, not `tp_`). However, the heap type struct is an internal detail, so the public API uses the `tp_` name and we say that it doesn't map to a struct field.
- Although `PyType_GetBaseByToken` takes a `PyTypeObject`, it should still type-check and raise `TypeError`, as if it was a `PyObject*`. (In C, casting is too easy.)

Thank you for the reviews. I've updated the draft PR (and this proposal a bit), based on the suggestions.
I guess it's time for the vote. API summary:
int PyType_GetBaseByToken(PyTypeObject *type, void *token, PyTypeObject **result);
#define Py_tp_token <...>
#define Py_TP_USE_SPEC NULL
LGTM. Everything I wish was different about the proposal is just to be more consistent with things we already have, but none of them would work, so it's really just wishing that the others were more like this one! (GetModuleByDef, particularly)
@mdboom and @serhiy-storchaka, what are your opinions?
The `defaultdict`'s union (`|`) could be an example of this proposal.
@mdboom and @serhiy-storchaka, are you OK with this API?
Clever idea!
Here are the current performance results taken from https://github.com/python/cpython/pull/121079#issuecomment-2311838232.
Repeat n-times in a C function (Windows PGO, the higher, the slower):
| find super-class A | n = 1 | n = 10 | n = 1 | n = 10 |
|---|---|---|---|---|
| subclass A | | | | |
| `GetBaseByToken` * n | (base) | (base) | | |
| `GetSlot` * n | 1.00x | 1.01x | | |
| `GetModuleByDef` + `CheckExact` * n | 1.01x | 0.92x | | |
| (`GetModuleByDef` + `CheckExact`) * n | 1.00x | 1.06x | | |
| `IsSubtype` * n | 0.99x | 0.97x | | |
| subclass C | | | | |
| `GetBaseByToken` * n | 1.00x | 1.13x | (base) | (base) |
| `GetModuleByDef` + `IsSubtype` * n | 1.02x | 1.08x | 1.04x | 0.97x |
| (`GetModuleByDef` + `IsSubtype`) * n | 1.02x | 1.34x | 1.04x | 1.21x |
| `IsSubtype` * n | 1.01x | 1.03x | 0.99x | 0.95x |
I now feel that setting `*result` to a new reference makes the type check non-trivial for the C compiler and for me. I would prefer it to be a borrowed reference when checking whether the found type is the given type itself or not. API consistency may win, but I'm not convinced the superclass should be incref'ed by this function rather than by the user as needed, since I think it is kept alive in the subclass's `tp_bases`. Under the current API design, I would use `Py_DECREF()`/`_Py_DECREF_NO_DEALLOC()` as needed right after calling `PyType_GetBaseByToken()`. That actually cancels most of the incref/decref overhead.
For that, it should IMO be a function like `PyType_GetBaseByToken_Borrow` in addition to the function that returns a new ref. I'd prefer not adding it to this vote.
(Since an object's class can be changed at runtime, there's a possibility that the class is deallocated after the call. It's not clear how long a borrowed reference would be valid.)
@mdboom, what's your opinion on this API?
Can I open an issue for the feature in the tracker?
Thanks for voting! Let's add the API now.
> Can I open an issue for the feature in the tracker?
Please do! I'll review the PR next week.
Thank you all very much for the decision.
This is a continuation proposal of PEP-489 and later PEPs. PEP-630 notes:
One known solution is to assign a C layout ID to particular heaptypes. It will be helpful for subclass checking in tp slot methods (e.g. `nb_add`, `tp_dealloc`), especially at the final phase where we cannot rely on the module state[^1].

For more context, see: https://discuss.python.org/t/55598/2
Proposal
Adding a pointer member to heaptypes, then asking module authors to assign a preferable value (token) if they agree that:
For example, an extension module that automatically wraps C++ classes could assign the `typeid`.

Introducing the `Py_tp_token` slot for the entry:

Unlike other type slots, this slot will accept `NULL` through the new dedicated `Py_TP_USE_SPEC` identifier:

`{Py_tp_token, Py_TP_USE_SPEC}`

The option above will instruct the `PyType_FromMetaclass` function to use its `spec` argument as a token (the slot's actual value).

An absence of the slot will disable the feature.

Introducing the `PyType_GetBaseByToken(type, token, ...)` helper function:

It will find a class whose token is valid and equal to the given one, searching the type and its superclasses.
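The three slot cases described above (slot absent, slot set to `Py_TP_USE_SPEC`, slot set to some other pointer) can be illustrated with a small stand-alone model. This is only a sketch: `toy_spec`, `toy_heaptype`, `TOY_TP_USE_SPEC` and `toy_set_token` are hypothetical names invented for the illustration, and nothing here is CPython code.

```c
#include <stddef.h>

/* Toy stand-ins invented for this sketch; not CPython definitions. */
#define TOY_TP_USE_SPEC NULL          /* models Py_TP_USE_SPEC */

typedef struct {
    const char *name;
} toy_spec;                           /* models a PyType_Spec */

typedef struct {
    void *ht_token;                   /* models the proposed ht_token member */
} toy_heaptype;

/* Store a token the way the proposal describes:
 *   - slot absent               -> token stays NULL (feature disabled)
 *   - slot is TOY_TP_USE_SPEC   -> the spec's own address becomes the token
 *   - any other slot value      -> that pointer is stored as-is
 */
void toy_set_token(toy_heaptype *type, toy_spec *spec,
                   int has_slot, void *slot_value)
{
    if (!has_slot) {
        type->ht_token = NULL;
    }
    else if (slot_value == TOY_TP_USE_SPEC) {
        type->ht_token = spec;
    }
    else {
        type->ht_token = slot_value;
    }
}
```

Using the spec's address as the default works in this model because a statically allocated spec has a unique address for the lifetime of the program, so a module author does not need to invent a separate token value.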
Specification
The `PyHeapTypeObject` struct will have a new member, the `ht_token` void pointer (`NULL` by default), which will not be inherited by subclasses.

The existing `PyType_FromMetaclass(..., spec, ...)` function will do the following, when the proposed slot ID, `Py_tp_token`, is detected in `spec->slots`:

`PyType_GetSlot(type, Py_tp_token)` will return `NULL` if a static type is given.

A helper function will be:
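`int PyType_GetBaseByToken(PyTypeObject *type, void *token, PyTypeObject **result);`

- Scan only the heaptypes ~that have a non-NULL token~, walking the type's `tp_mro` if it exists, or walking the `tp_bases` recursively.
- On error, set `*result` to `NULL`, set an exception, return `-1`.
- If not found, set `*result` to `NULL` and return `0`.
- If found, set `*result` to a new reference and return `1`.
- Raise a `SystemError` when `token` is `NULL`.
- Raise a `TypeError` when `PyType_Check(type)` returns false.

The `result` argument accepts `NULL` not to assign a reference (check only mode).

Reference implementation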
https://github.com/python/cpython/pull/121079
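Separately from the reference implementation, the return convention described in the Specification can be modeled in plain C. The sketch below is a toy model under stated assumptions: `toy_type` and `toy_get_base_by_token` are hypothetical names, the MRO is assumed to be always populated (the `tp_bases` fallback is omitted), and both error cases are collapsed into a bare `-1` because the model has no exception machinery.

```c
#include <stddef.h>

/* Toy model only: this struct is NOT CPython's type object. */
typedef struct toy_type {
    const char *name;
    int is_heaptype;            /* static types never carry a token */
    void *token;                /* models ht_token; NULL means "no token" */
    struct toy_type **mro;      /* NULL-terminated; first entry is the type itself */
    int refcnt;                 /* models the reference count */
} toy_type;

/* Return convention modeled on the proposal:
 *    1 -> found; *result receives a new reference (refcnt is bumped)
 *    0 -> not found; *result is set to NULL
 *   -1 -> error (NULL type or NULL token in this model); *result is set to NULL
 * Passing result == NULL selects the "check only" mode.
 */
int toy_get_base_by_token(toy_type *type, void *token, toy_type **result)
{
    if (result != NULL) {
        *result = NULL;
    }
    if (type == NULL || token == NULL) {
        return -1;
    }
    for (toy_type **p = type->mro; *p != NULL; p++) {
        toy_type *base = *p;
        if (!base->is_heaptype) {
            continue;           /* only heap types are scanned */
        }
        if (base->token == token) {
            if (result != NULL) {
                base->refcnt++; /* hand out a new reference */
                *result = base;
            }
            return 1;
        }
    }
    return 0;
}
```

Because the token is compared per type and never copied to subclasses, a subclass without its own token is still found through its base: looking up A's token starting from a subclass B walks B's MRO and stops at A.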
Performance
A subclass check in a slot method currently consists of the following steps:
1. `PyType_GetModuleByDef` (walks MRO)
2. `PyModule_GetState`
3. `Py*_CheckExact`
4. `PyType_IsSubtype` (walks MRO)

`PyType_GetBaseByToken` is cheaper than `(1)+(2)+(3)`, but a little more expensive than `4.PyType_IsSubtype`[^2]. Mostly, using the new function alone will be efficient enough except when staying in C functions and repeating `(3)(4)` with a module state passed around[^3].

Backwards Compatibility
- A new member, `ht_token`, is added to heap types.
- A new slot, `Py_tp_token`, is added with an identifier, `Py_TP_USE_SPEC`.
- A new function, `PyType_GetBaseByToken`, is added, ~whose documentation will mention the new slot above~.

UPDATE: `Py_tp_token`, `Py_TP_USE_SPEC` and `PyType_GetBaseByToken` will be documented individually.

Previous discussions
[^1]: The GC can clear the module state or can erase the references to the module from heaptypes: gh-115874
[^2]: `PyType_IsSubtype` can be slower on recent Windows PGO builds due to the unstable optimization.
[^3]: `PyType_GetModuleState` is available after `PyType_GetBaseByToken`.