hpyproject / hpy

HPy: a better API for Python
https://hpyproject.org
MIT License
1.02k stars 52 forks source link

Multiphase module initialization #393

Closed steve-s closed 1 year ago

steve-s commented 1 year ago

I think the best way to review is to start from the changes in the docs/examples and tests.


Make the multiphase module initialization the only way to initialize HPy modules, abandons the previous single phase initialization.


TL;DR: how module initialization looks like with this new API:

static HPyDef *moduledefs[] = { &exec1, NULL };

HPyModuleDef mod = { .defines = moduledefs, ... }

HPy_MODINIT(my_extension_name, mod)



---
HPy multiphase module initialization is a restricted version of what CPython API has. Currently both the CPython ABI and CPython HPy universal impl implement this via CPython's multiphase module initialization, but to some extent it can be emulated on top of CPython single phase module initialization if the need is (only custom `HPy_mod_create` slot would be disallowed in such case).

`HPy_MODINIT` takes name of the extension, so that it can generate `PyInit_XXX` for CPython, and `HPyModuleDef`. There is no user provided init function anymore. The runtime initializes the module and if the user wants to populate the module, or do any other initialization, they can define one or more `HPy_mod_exec` slots that are called by the runtime right after the module is created. The slots are defined in the same way as slots for types: by using HPyDef_SLOT macro.

As a consequence, `HPy_MODINIT(myextension, hpydef)` is all that is necessary if the module only exposes module functions.

Users can also define `HPy_mod_create` slot, similar to CPython's `Py_mod_create`, but in HPy this slot can return only objects that are not of the builtin module type.

The reasoning is that builtin module objects can be created by the runtime from `HPyModuleDef` and, for the time being, there is no use-case for manually creating builtin module objects in the `HPy_mod_create` slot or anywhere else. If such a use-case emerges, support for it can be easily added to HPy. (CPython unit tests create `SimpleNamespace` in the "create" slot and I could not find any other use of that slot anywhere on the internet...)

If user specifies the `HPy_mod_create slot`, specifying anything else (setting it to a non-default value) in `HPyModuleDef` is a runtime error (likely a bug or misunderstanding on the user side).

Additionally, this PR removes `HPyModule_Create` API/ABI function. The CPython equivalent `PyModule_Create` works only for single phase module initialization. We may add equivalents of `PyModule_FromDefAndSpec` to allow manual module creation if there is a use-case. Otherwise only the runtime can create a builtin module object and only from the provided `HPyModuleDef`.

Additionally, removes the CPython semantics of negative `m_size`. HPy will add other more explicit means to communicate subinterpreters support -- I think we should just have explicit fields/flags field in `HPyModuleDef` for that, I will add it in a followup PR.
steve-s commented 1 year ago

rebased on master and fixed conflicts

mattip commented 1 year ago

I merged the documentation PR, and as predicted now this has conflicts, sorry.

fangerer commented 1 year ago

just for reference: this PR resolves #397

steve-s commented 1 year ago

Is there any way to banchmark this to see if it changes anything?

This should have an effect only on module creation/initialization, which I assumed is typically not on the hot-path, and it should not make it significantly more complex algorithmically, so it did not seem to me this needs extra performance check.

I merged the documentation PR, and as predicted now this has conflicts, sorry.

np, it's actually better. In the end, the conflicts pointed to exactly the things that I'd have to update anyway.

I've rebased the PR and updated the documentation. I tried to keep the already reviewed changes in one commit and added the new docs improvements in the new separate commit.

steve-s commented 1 year ago

Since this changes the public API in a non-backward compatible way, it should increment the major API number, right?

Yes, that's a good question. My take was that until we finish the first milestone and release it, we allow ourselves to break binary compatibility.

fangerer commented 1 year ago

IMO, we should just increase the major API version when releasing. I don't think that we need or even should give compatibility guarantees on master.

mattip commented 1 year ago

Thanks @steve-s