python / cpython

The Python programming language
https://www.python.org
Other
62.28k stars 29.92k forks source link

PyType_FromSpec API fails to use metaclass of bases #89546

Closed 7227ac6e-e071-42b0-9479-676fdb34fe4f closed 2 years ago

7227ac6e-e071-42b0-9479-676fdb34fe4f commented 2 years ago
BPO 45383
Nosy @encukou, @seberg
PRs
  • python/cpython#28748
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['interpreter-core', 'expert-C-API', 'type-crash', '3.11'] title = 'PyType_FromSpec API fails to use metaclass of bases' updated_at = user = 'https://github.com/seberg' ``` bugs.python.org fields: ```python activity = actor = 'seberg' assignee = 'none' closed = False closed_date = None closer = None components = ['Interpreter Core', 'C API'] creation = creator = 'seberg' dependencies = [] files = [] hgrepos = [] issue_num = 45383 keywords = [] message_count = 9.0 messages = ['403273', '408532', '408720', '408723', '408790', '408791', '408793', '408797', '408801'] nosy_count = 2.0 nosy_names = ['petr.viktorin', 'seberg'] pr_nums = ['28748'] priority = 'normal' resolution = None stage = None status = 'open' superseder = None type = 'crash' url = 'https://bugs.python.org/issue45383' versions = ['Python 3.11'] ```

    7227ac6e-e071-42b0-9479-676fdb34fe4f commented 2 years ago

    The PyType_FromSpec fails to take care about MetaClasses.

     https://bugs.python.org/issue15870

    Asks to create a new API to pass in the MetaClass. This issue is only about "inheriting" the metaclass of the bases correctly. Currently, Python fails to take into account that the bases may be MetaClass and not PyType_Type instances.

    encukou commented 2 years ago

    I haven't forgotten this issue, but when I get to it it always leads to a rabbit hole. Sometimes just chasing refleaks, but there are deeper issues as well.

    AFAICS, there's no way to call metatype.tp_new for such a class. I guess the safest option is to fail when the metaclass has a custom tp_new -- because that means the metaclass is requesting that it wants to allocate/initialize its types itself.

    We can call metatype.tpinit, and probably should. And \_init_subclass__ too, I suppose.

    But at that point, this is duplicating a lot of existing functionality, and I'm starting to wonder if this wouldn't all be better with calling the metaclass instead. What's missing?

    Does that seem like a more reasonable direction to explore?

    7227ac6e-e071-42b0-9479-676fdb34fe4f commented 2 years ago

    Sorry, I need some time to dive back into this, so some things might be garbled :). Yes, I do agree supporting a custom tp_new here seems incredibly tricky. I have not thought about the implications of this, though.

    guess the safest option is to fail when the metaclass has a custom tp_new

    That would seem OK, but I can't quite judge it. It may mean that I have to do a custom factory to create new metaclass instances from Python, but that is probably for the better anyway.

    Now, calling tp_new is a bit useless, since from C we don't have a dict anyway (at least not really). So yeah, this does not make sense at all for most Python defined metaclasses... (they may want to inspect/mutate the dict)

    But at that point, this is duplicating a lot of existing functionality, and I'm starting to wonder if this wouldn't all be better with calling the metaclass instead.

    I do not think I am following :(. My worry is that I want people to create a MetaClass instance through C (but not be locked into static types forever).

    My current thought is that I would like it be possible to do something like:

        /* Create a new type object */
        type_spec = {stuff};
        newtype = PyType_FromSpec(&type_spec);
        /* Finalize the MetaClass */
        metatype_spec = {more_stuff};
        PyArray_InitDTypeFromSpec(newtype, &metatype_spec);

    Of course I can move this into a single function and create the class for the user. But I am uncertain that piping everything through tp_new will help? At some point I thought that the user should create a subclass, and I create another class inheriting it. But it also seems convoluted and complicated.

    I have no idea right now, but maybe there could also be a way to make creating a metaclass-factory in C easier, rather than supporting PyType_FromSpec for metaclasses. (I.e. an PyType_InitFromSpec() doing most of what PyType_FromSpec does, but really meant to be only used be such metaclass factories.)

    • basicsize/itemsize could be allowed with __basicsize/__itemsize in the dict.

    Do we need this? I need the basicsize of the metaclass, but that of the class seems fixed/OK?

    • flags: we could add mechanisms to set individual flags after the type is created, as needed.

    Seems fine, yeah.

    • slots can usually be applied after the class is created; maybe there should be a public function for this.
    • members could theoretically be copied to individual descriptors; there doesn't seem much need for keeping tp_members around.

    But a Python MetaClass (that the author may not even realize about) might need access to these to work properly? A bit far fetched, but...

    7227ac6e-e071-42b0-9479-676fdb34fe4f commented 2 years ago

    It is probably early, but I am starting to like the idea of a "C MetaClass factory" helper/indicator.

    It seems to me that yes, at least tp_new cannot be called reasonable for a class that is created in C, it is just too confusing/awkward to try to push the C stuff through the Python API. (And the Python API is the typical one that should not be inconvenienced)

    Which gives two possibilities if I want this capability?:

    1. Force use of a custom class factory in Python (i.e. not through __new__). But that seems hard, since I need to refuse __new__()!).
    2. Use a class factor in C which never calls __new__ and knows that this is OK.

    This is not my turf, so I like unholy, but maybe pragmatic things :). Would a slot/flag indicating Py_using_metaclass_cinit_pretty_promise_please "solve" these issues?

    I mean, we have two ways to create classes (C and Python), I am not sure it is plausible to untangle this on the MetaClass level, so maybe the only way forward is to embrace it: Some Python MetaClasses just can't be instantiated from C, because we don't know it will work. If we want to allow instantiating the MetaClass from C, we need some way to set this up. And either we cannot use PyType_FromSpec then, or we need to tell it that we know what we are doing.

    encukou commented 2 years ago

    I don't see how instantiating a metaclass with non-default tp_new would work if you don't know some details about the specific metaclass. So IMO we can we limit ourselves to scenarios where either: 1) the metaclass uses default tp_new, or 2) the code that creates the class knows about the metaclass

    For case 2), we could leave allocation to the calling code, and add an API function that does the rest of the work of applying the spec. Something like:

        /* Create a metaclass */
        metatype_spec = {stuff};
        metatype = PyType_FromSpecAndBases(&metatype_spec, &PyType_Type);
        /* Create a type */
        type_spec = {other_stuff};
        newtype = alloc_metatype();
        PyType_ApplySpec(newtype, &type_spec);

    PyType_ApplySpec would assert PyType_Ready wasn't called before, and call it after filling in the name, slots, etc. The metatype could disable its tp_new to disallow PyTypeFromSpec (and Python \_new__), effectively enforcing "using_metaclass_cinit_pretty_promise_please".

    There could be a slot for code to run at the end of PyType_ApplySpec -- the "PyArray_InitDTypeFromSpec" in your pseudocode.

    That seems better than calling the metaclass, but to answer questions on that idea:

    > - basicsize/itemsize could be allowed with __basicsize/__itemsize in the dict. Do we need this? I need the basicsize of the metaclass, but that of the class seems fixed/OK?

    That's the size of the instance. One more level down :)

    > - members could theoretically be copied to individual descriptors; there doesn't seem much need for keeping tp_members around. But a Python MetaClass (that the author may not even realize about) might need access to these to work properly? A bit far fetched, but...

    Seems very far-fetched to me. IMO these, like e.g. tpmethods, should only be used as arguments for the code that puts the descriptors in \_dict__. How the descriptors got there is a detail, the class doesn't need to use tp_members (nor advertise that it does).

    7227ac6e-e071-42b0-9479-676fdb34fe4f commented 2 years ago

    Fully, agree! In the end, PyType_FromSpec replaces type.__new__() (and init I guess) when working in C. In Python, we would call type.__new__ (maybe via super) from the metatype.__new__, but right now, in C, the metatype cannot reliably use PyType_FromSpec in its own metatype.__new__ to do the same.

    I agree with the scenarios:

    PyType_ApplySpec would provide that way to create a custom metatype.__new__ in C when PyType_FromSpec() would otherwise reject it to make the first scenario safe. A flag probably can do the same. I have no preference, ApplySpec seems great to me.

    encukou commented 2 years ago

    Nice! It's starting to look reasonable, I'll try an implementation when I get some focus time. (Sadly I can't promise it'll be this year.)

    Just one detail:

    A flag probably can do the same. I have no preference, ApplySpec seems great to me.

    I didn't get what you mean here.

    7227ac6e-e071-42b0-9479-676fdb34fe4f commented 2 years ago

    Well, what we need is a way to say: I am calling type.__new__ (i.e. PyType_FromSpec) on purpose from (effectively) my own mytype.__new__?

    That is, because right now I assume we want to protect users from calling PyType_FromSpec thinking that it is equivalent to calling class new(base) when it may not be if base is a metaclass. So calling PyType_FromSpec might refuse to work if it finds a custom metaclass.__new__ (init?).

    I don't really see that it matters if we only support effectively this from C:

    class MyMetaClass(type):
        def __new__(cls, *args, **kwargs):
            self = type.__new__(...)  # this is PyType_FromSpec
            # more stuff

    So, I thought telling PyType_FromSpec that we are "inside" a custom __new__ is sufficient and that even as a flag passed as part of the spec could be enough. But... I agree that I do not quite see that it would be pretty, so it probably was a bad idea :).

    Plus, if you add a new method it should also solves the issue of setting the tp_type slot to the metaclass explicitly when it is not implicit by inheritance (which is the only thing I care about). (PyType_FromSpec and PyType_ApplySpec will still need to do the work of resolving the metaclass from the base classes, though.)

    7227ac6e-e071-42b0-9479-676fdb34fe4f commented 2 years ago

    Btw. huge thanks for looking into this! Let me know if I can try to help out (I can make due with static metatypes, but my story seems much clearer if I could say: Well with Py 3.11 you can, and probably should, do it dynamically.). I had lost a lot of time chasing "metaclass should just work" followed by "but I can't get it right without bad hacks".

    And now the solution seems fairly clear, which is amazing :)!

    encukou commented 2 years ago

    After PyType_From* functions are enhanced in #93012 and #28748, I'd like to take a moment to clean it up:

    Some of this is already entangled in #28748, but still, a separate cleanup PR is best practice.

    seberg commented 2 years ago

    Thanks for fixing up the PR and pushing this through Petr! Sounds like this can be closed then :).

    encukou commented 2 years ago

    Not yet, I'm preparing a clean-up PR (which I plan to send after #93471 to avoid conflicts).