Closed rlamy closed 3 years ago
Maybe you could tie the pypy download to the latest nightly instead of the 7.3.3 release
Current status:
- Broken on PyPy. The following legacy_slots need to be implemented: Py_tp_dealloc, Py_tp_getset, Py_tp_alloc, Py_tp_new, Py_tp_free
I'm confused: currently tp_alloc
& co. are commented out from public_api.h
: how is it possible that it works with the universal ABI if you can't use them in HPy?
https://github.com/hpyproject/hpy/blob/fdc6047e6db755ecaeded1f805b91823eb11f08d/hpy/tools/autogen/public_api.h#L387-L392
Moreover, I'm not even sure if we can support the very same semantics as tp_alloc
, tp_dealloc
and tp_free
directly, or if we need to find something which is more GC friendly. I need to think.
I'm confused: currently
tp_alloc
& co. are commented out frompublic_api.h
: how is it possible that it works with the universal ABI if you can't use them in HPy?
These are CPython slots, not HPy slots. They are trivially supported on CPython, with both ABIs. On PyPy, however, it's a lot more difficult. I think that the only way to support them is to fall back to cpyext and create a W_PyCTypeObject
instead of a W_HPyTypeObject
, which isn't very appealing.
I think tp_getset
and tp_new
are different from tp_alloc
, tp_free
, tp_dealloc
since the first two have python dunder methods, no? The others are more tied to the implementation.
I think
tp_getset
andtp_new
are different fromtp_alloc
,tp_free
,tp_dealloc
since the first two have python dunder methods, no? The others are more tied to the implementation.
tp_getset
could "easily" be supported. The problem with tp_new
is that it returns a PyObject*
and how do we turn that into a W_HPyObject
?
The problem with
tp_new
is that it returns aPyObject*
and how do we turn that into aW_HPyObject
?
I think tp_new
is outside the scope of what can be done with HPy. Not all the C-API can be converted to opaque handles.
I'm confused again. We already support HPy_tp_new
, and it's implemented also by PyPy. See e.g. test_hpytype.py:test_HPy_new
.
The problem with tp_new is that it returns a PyObject* and how do we turn that into a W_HPyObject?
well, we already have machinery to convert HPy <==> PyObject*
, we use it to implement HPy_AsPyObject
, HPy_FromPyObject
and in general all the legacy_*
features. I fail to see what is the problem, but I might be overlooking something.
Also, keep on mind that using .legacy_slots = {tp_new, ...}
is temporary, of course. Eventually, it should become a .slots = {HPy_tp_new, ...}
tp_alloc
and tp_free
are an open question: the current plan was to NOT support them: the idea is that the allocation/deallocation of memory should belong to the interpreter, to allow PyPy to allocate HPy instances in GC-managed memory, which should bring a big speed bump compared to malloc and free.
However, by doing so we prevent some legit optimizations on CPython: what if I e.g. want to use a free list for my instances?
One option could be to introduce a slot such as HPy_tp_alloc_maybe
, and then each implementation is allowed to decide whether to use it or not. Ideally, we should find some concrete cases in which tp_alloc
is used in the wild and check whether we want/can/should support those. Probably we should open a proper API design issue to discuss it.
tp_dealloc
is a closed issue: we decided that we do NOT want to support it. Instead, we introduced HPy_tp_destroy
, which is the moral equivalent of PyPy lightweight destructors: the key feature of HPy_tp_destroy
is that it does not receive an HPyContext
: this way we are sure that you can't call arbitrary python code from the destroyer, and the only thing you can do is to cleanup your C-level fields. Incidentally, this allows PyPy to implement it using lightweight destructors, see _hpy_universal/interp_type.py:W_HPyObject.__del__
.
I'm confused again. We already support
HPy_tp_new
, and it's implemented also by PyPy. See e.g.test_hpytype.py:test_HPy_new
.
That's because you're thinking of HPy slots, again ;-) I'm only discussing legacy CPython slots here.
The problem with tp_new is that it returns a PyObject* and how do we turn that into a W_HPyObject?
well, we already have machinery to convert
HPy <==> PyObject*
, we use it to implementHPy_AsPyObject
,HPy_FromPyObject
and in general all thelegacy_*
features. I fail to see what is the problem, but I might be overlooking something.
The problem is that we'd have an HPy type whose instances are cpyext objects, which is likely to cause some headaches.
In any case, there is a more fundamental issue. In the CPython API, tp_alloc
and tp_free
are C function pointers that are only called from C code. tp_new
and tp_dealloc
implement, respectively, the dunder methods __new__()
and __del__()
. They do call tp_alloc
/tp_free
at the C-level. However, on pypy, HPyType_FromSpec()
doesn't allocate a PyTypeObject, so trying to call PyArray_Type.tp_alloc
will segfault.
Also, keep on mind that using
.legacy_slots = {tp_new, ...}
is temporary, of course. Eventually, it should become a.slots = {HPy_tp_new, ...}
In numpy's case, the temporary solution is likely to last for a long time. It would be rather sad if numpy had to stop supporting pypy because of HPy! But I guess it's fine if we say that tp_new et al. must be converted to be supported on pypy.
The problem is that we'd have an HPy type whose instances are cpyext objects, which is likely to cause some headaches.
yes, but that's something we surely need to solve, probably sooner than later. Some relevant discussions:
pypy/module/_hpy_universal/test/support.py:test_legacy_slots_getsets
: this is a test which we need to skip because AsPyObject
doesn't work properly at the moment<antocuni> Hodgestar: I reviewed MR 783. I don't understand what you mean in your last sentence w.r.t issue 83, though
<Hodgestar> antocuni: Thanks!
<Hodgestar> antocuni: Re issue 83: The comment in PyPy on legacy_getset not being implemented implies that implementing it relies on hpy #83 being solved, but I'm not sure that's actually the case? If it is, I don't yet understand the link between the two.
<antocuni> I can't find the comment :)
<antocuni> ah, found it
<antocuni> IIRC, the problem is that currently HPy_AsPyObject is "broken" if you call it on an hpy instance
<antocuni> what happens is the following
<antocuni> 1. you create an HPy type, which is an W_HPyTypeObject
<antocuni> 2. you instantiate it and get a W_HPyObject
<antocuni> 2a. the W_HPyObject has a pointer hpy_data, which points to the user-defined C struct
<antocuni> 3. on CPython, when you call HPy_AsPyObject on such an object, you get a pointer to the C struct
<antocuni> 3a. so for example, if you are implementing a type point, on CPython you get a "struct Point*" on which you can do "p->x" and "p->y"
<antocuni> 4. on PyPy you get something completely different: you get the cpyext proxy for a generic W_Root: so, if you cast the PyObject* to "struct Point*" and do "p->x", things will explode (and/or, you will read nonsense)
<antocuni> now, look at test_legacy_slot_getsets
<antocuni> you see that it does exactly that: it receives a PyObject* point but it assumes it's castable to PointObject*
<Hodgestar> Ah, got it.
<antocuni> to go forward, we surely need to fix issue 83 kind of soon, but it's not so easy of course
To have any chance to implement any legacy_*
stuff on PyPy, we need a two way HPy <==> PyObject*
. I fear that the only reasonable way to implement is to declare that if an HPy type uses some tricky legacy_*
stuff, it automatically becomes a cpyext type, and its instances are W_BaseCPyObject
.
Then we need a way to implement HPy_FromPyObject
: this can be done by adding special-cases here and there inside _hpy_universal
, or maybe by introducing a base class which is a common ancestor of both W_HPyObject
and W_BaseCPyObject
.
What are the "tricky legacy_*
stuff"? I don't know for sure. We could say that as soon as you use any legacy_*
, your HPy type becomes a cpyext type, or we could try to be smarter and do it only if really needed (e.g., if you only use legacy_methods
it might not be needed). It's worth noting that hpy/cpyext types will be slower than pure hpy types in PyPy, but this is maybe a good motivation to complete the porting :)
In any case, there is a more fundamental issue. In the CPython API, tp_alloc and tp_free are C function pointers that are only called from C code. tp_new and tp_dealloc implement, respectively, the dunder methods new() and del(). They do call tp_alloc/tp_free at the C-level. However, on pypy, HPyType_FromSpec() doesn't allocate a PyTypeObject, so trying to call PyArray_Type.tp_alloc will segfault.
well, with my proposed solution above, this problem will be automatically solved I think.
Also, keep on mind that using .legacy_slots = {tp_new, ...} is temporary, of course. Eventually, it should become a .slots = {HPy_tp_new, ...}
In numpy's case, the temporary solution is likely to last for a long time. It would be rather sad if numpy had to stop supporting pypy because of HPy! But I guess it's fine if we say that tp_new et al. must be converted to be supported on pypy.
why do you say so? Is there any fundamental issue which prevents ndarray.__new__
to be implemented in HPy?
Yes, I agree that falling back to a cpyext type is the only solution to make it work on pypy. I guess the exact rules will be determined by the outcome of hpyproject/hpy#156.
PyArray_Type
is now defined usingHPyType_FromSpec()
, which is a straightforward change. The difficulty was more in hacking the build and CI systems to support HPy.Note that there are a few issues with this:
It doesn't compile on PyPy (because it lacks_HPyGetContext()
ATM)tp_weaklistoffset
.