hpyproject / hpy

HPy: a better API for Python
https://hpyproject.org
MIT License
1.02k stars 52 forks source link

Add dictionary type to HPy #458

Closed DuToitSpies closed 7 months ago

DuToitSpies commented 7 months ago

Adds h_DictType to HPy, as this is needed for the Cython port.

fangerer commented 7 months ago

The error in tests / Check autogen does not seem to be related to the changes in that PR. I'll take a look ...

mattip commented 7 months ago

I am a bit troubled by the many shape objects. They are used to access low-level, implementation specific APIs for the different objects, right? Isn't there a way to tell cython "hey, don't do that"? The type objects are opaque on PyPy, so from my very small world this is unneeded complication.

hodgestar commented 7 months ago

I am a bit troubled by the many shape objects. They are used to access low-level, implementation specific APIs for the different objects, right?

The shapes provide a means to access the internal struct of built-in objects after the PyObject header part. It mimics what happens for user defined subclasses.

It's needed for cases where an extension subclasses built-in types, I think. Someone else please correct me if needed.

@mattip has a point here though -- to what extent are these additional pieces of data really a CPython implementation detail? One could make a good argument that they are entirely an implementation detail.

The question then is whether we can provide a better porting path for existing libraries, or whether implementations just have to mimic this and possibly be slow?

fangerer commented 7 months ago

Isn't there a way to tell cython "hey, don't do that"?

Yes there is but that is not unrelated to the shapes. What Du Toit needs for his Cython/HPy work is essentially HPyContext.h_DictType. Further, he just added the additional built-in shape because dict is a built-in type.

The type objects are opaque on PyPy, so from my very small world this is unneeded complication.

I can understand this. The built-in shapes are not intuitive. I would like to raise your attention to #169 (and the implementing PR #335). Since the issue and PR are a lot of reading, I try to summarize here:

One could make a good argument that they are entirely an implementation detail.

Yes, you can. It could be considered as an implementation detail. But this is similar to, e.g., HPy_GetItem_s. Why do we have it. Wouldn't it be better to just have HPy_GetItem and if the key happens to be created from a C string, then this is an impl detail, right? I see the built-in shapes as a tradeoff between performance and usability. Also, I think the built-in shape didn't make things worse. Please remind that we previously had the legacy flags and the shapes are just a generalization of that.

hodgestar commented 7 months ago

@fangerer Maybe I misunderstood how the shapes are used then. Do we just need their size? And do we only really need them on CPython?

fangerer commented 7 months ago

And do we only really need them on CPython?

We don't strictly need them at all. As I wrote, we could store the information in HPyType_Extra_t at the type. I think that can best be understood by looking at the debug mode version of ctx_AsStruct_*: https://github.com/hpyproject/hpy/blob/bdf7e7ff99194b9ec5d15334303420c0b496f143/hpy/debug/src/debug_ctx.c#L345 The built-in types are just an additional information that can help interpreters.

Do we just need their size?

No, the size would not help alternative implementations. E.g. in GraalPy, we are not allocating objects by size (as CPython does), we are allocating them using a Java class. But it helps GraalPy because we could store the user data pointer in a well-known location of the appropriate Java class.

fangerer commented 7 months ago

Even if this PR raised some discussions about the built-in shapes, I will merge it because this PR is just adding one (missing) shape. So, if built-in shapes are the problem, not merging this PR would solve it. @hodgestar @mattip Any concerns merging this PR?

hodgestar commented 7 months ago

@fangerer Very happy for this to be merged. Let's add the shape discussion for the agenda of the next dev call.