hpyproject / numpy-hpy

The fundamental package for scientific computing with Python.
https://numpy.org
Other
20 stars 1 forks source link

Are Numpy API functions for creating legacy dtypes still necessary? #14

Open fangerer opened 1 year ago

fangerer commented 1 year ago

I currently try to get rid of HPy_SetType in branch graal-team/hpy . We used it (as a replacement for Py_SET_TYPE) in function dtypemeta_wrap_legacy_descriptor. I was able to refactor the creation of (_hpy)_builtin_descrs (see https://github.com/hpyproject/numpy-hpy/blob/graal-team/hpy/numpy/core/src/multiarray/arraytypes.c.src#L4842). I did so by first creating the new dtype and then instantiating the singleton. However, it is not that simple for Numpy API (H)PyArray_RegisterDataType because the descriptor to register is provided as argument (e.g. sometimes created with (H)PyArray_DescrNew).

My question is: are (H)PyArray_DescrNew and (H)PyArray_RegisterDataType legacy APIs that have some replacement and could we just drop them in Numpy/HPy? Or is there some other way to solve this?

mattip commented 1 year ago

@seberg thoughts? As far as I remember, those interfaces are user-facing, even after the dtype refactor.

mattip commented 1 year ago

Is the need for HPy_SetType removed in python3.12 with the new PyType_FromMetaclass?

seberg commented 1 year ago

They are something in between. I would expect just failing for them is fine mostly, except for user defined dtypes (which are not super large as a user-base) in the case of PyArray_RegisterDataType. PyArray_DescrNew is pretty much only used for weirder operations on string dtypes, which may be used in the wild, but maybe not so commonly that you need to focus on it.

That said, PyArray_DescrNew seems fixable to me? Just create the new object and copy all fields over? That is effectively what it does anyway. (For new user DTypes it would be nice to provide a way to get a copy since their basicsize may not match and in that case we cannot do the copy for them.).

For PyArray_RegisterDataType it is only fixable by changing the API a bit to create a new object instance; that way you don't have a statically allocated instance. I had some plans to do some time after branching 1.25. (it requires a C-API break)

Is the need for HPy_SetType removed in python3.12 with the new PyType_FromMetaclass?

I would think/hope this helps for DType classes, but I guess we may need to adapt NumPy for that?

fangerer commented 1 year ago

Thanks @seberg for the explanation and suggestions.

That said, PyArray_DescrNew seems fixable to me? Just create the new object and copy all fields over?

Yes, sounds good. I'll do that.

For PyArray_RegisterDataType it is only fixable by changing the API a bit to create a new object instance

Agreed. I will change the signature. I do also not see how we could patch an existing legacy dtype in a proper HPy way.

seberg commented 1 year ago

Agreed. I will change the signature. I do also not see how we could patch an existing legacy dtype in a proper HPy way.

I think it should work, but you need to create the instance for the user (and throw away what they gave). Right now, you could try to do that (copy everything) but update the type_num at least (and keep pointers pointing to the original), that might keep a lot of things working. That said, I think its fine to tag on an error and say "maybe after its fixed in numpy" (which is blocked for a month due to requiring API changes).