antocuni opened this issue 4 years ago
I'm personally not a fan of casting in this manner. The reason is at least in part related to how we execute this on Graal, via the LLVM bitcode. The problem is that the bitcode doesn't keep any information about the cast around. So we just get an "access this handle at offset such and such" and then we have to figure out what they cast that to, based on the actual type that we know is behind the handle and the offset. I would definitely be in favour of the `HPyTuple_Check` function for casting, since that communicates to us what type of handle is expected. It also gives us an opportunity to potentially copy things to native at that point when it makes sense, rather than having to copy more eagerly because someone might do a C-level cast.
That being said, I guess we could implement `HPyTuple_CAST` as a function call rather than a macro in our API headers, and then we'd get the call. That's still much better for us than a straight C-level typecast. The disadvantage (for us) would be that we'd have to have our own headers for this.
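To make the trade-off concrete, here is a minimal sketch (all names, macros, and handle layouts are hypothetical, not the actual HPy headers) of how the same cast could be a zero-cost macro in a CPython-specific header but a real function in the universal one, so that an implementation like GraalPython sees an actual call in the bitcode:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the handle types. */
typedef struct { intptr_t _i; } HPy;
typedef struct { intptr_t _i; } HPyTuple;

#ifdef HPY_CPYTHON_SPECIFIC
/* CPython-specific header: a plain C-level reinterpretation, zero cost. */
#define HPyTuple_CAST(h) ((HPyTuple){ (h)._i })
#else
/* Universal header: a real function, so the runtime observes the cast
   and could attach or verify type information at this point. */
static HPyTuple HPyTuple_CAST(HPy h)
{
    return (HPyTuple){ h._i };
}
#endif
```

Either way the call sites look identical; only the headers differ, which is exactly the "own headers" cost mentioned above.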
@antocuni Maybe. For now I'll just point out that having "obj" be automatically closed both makes the API more irregular and is wrong in your example, because a function should not close its arguments.
Also, I think it's similar to what we discussed with custom object protocols. Here is a version that looks more like https://github.com/pyhandle/hpy/issues/9. Note that the advantage is that it doesn't need custom functions or macros to access the size and the elements of the tuple.
```c
void print_tuple(HPyContext ctx, HPy obj)
{
    // HPyTuple_AsTuple returns a small struct.
    // ``t.items`` is a C array of HPyFields, valid as long as the "obj"
    // handle is open.
    HPy_Tuple t = HPyTuple_AsTuple(ctx, obj);
    for (int i = 0; i < t.size; i++) {
        HPy item = HPy_Load(t.items[i]);
        HPy_Print(ctx, item);
        HPy_Close(item);
    }
}
```
@timfel
> The problem is that the bitcode doesn't keep any information about the cast around. So we just get an "access this handle at offset such and such"
Note that in the current situation, what you get are calls like `HPyTuple_GET_SIZE(HPyContext ctx, HPy o)`: so, you don't have any type information anyway. Basically, with this proposal you would get the same bitcode as now; the only difference is that the C compiler does some more typechecking.
> The disadvantage (for us) would be that we'd have to have our own headers for this.
A note on the direction where HPy is going right now: at the moment we have two sets of headers, one which is specific to CPython and one which is "universal". The idea is that a module compiled with the universal headers can be imported by any Python implementation, including CPython, although this will be slower (this is what the `cpython-universal` implementation does).
Nothing prevents a Python implementation from providing its own specific headers, of course, and/or from deciding that it will NOT support universal modules.
> Nothing prevents python implementation to provide their own specific headers, of course and/or to decide that it will NOT support universal modules.
I think you mean "decide that it will ONLY support universal modules".
@arigo
> For now I'll just point out that "obj" is automatically closed both makes the API more irregular, and is wrong in your example, because a function should not close its arguments.
True :(. My intent was to make the two cases in which you use `HPyTuple_CAST` or `HPyTuple_Check` more similar. I suppose we could return a new handle even from `HPyTuple_CAST`, albeit this will cause a small performance hit on CPython compared to what you have now.
> Also, I think it's similar to what we discussed with custom object protocols.
yes, it's also similar to what is written here: https://github.com/pyhandle/hpy/blob/master/docs/api.md#specialized-protocols
> Here is a version that looks more like #9. Note that the advantage is that it doesn't need custom functions or macros to access the size and the elements of the tuple.
>
> ```c
> void print_tuple(HPyContext ctx, HPy obj)
> {
>     // returns a small struct.
>     // ``t.items`` is a C array of HPyFields, valid as long as the "obj" handle is open
>     HPy_Tuple t = HPyTuple_AsTuple(ctx, obj);
>     for (int i = 0; i < t.size; i++) {
>         HPy item = HPy_Load(t.items[i]);
>         HPy_Print(ctx, item);
>         HPy_Close(item);
>     }
> }
> ```
maybe, but this is hard/slowish to implement on PyPy: you would need to allocate a non-movable array and fill it with handles. Or maybe pin the existing `W_TupleObject.wrappeditems`, but then how would you implement it for things like `W_SpecialisedTupleObject`?
> > Nothing prevents python implementation to provide their own specific headers, of course and/or to decide that it will NOT support universal modules.
>
> I think you mean "decide that it will ONLY support universal modules".
No, I meant that if for some reason `graalpython` can't load universal modules (e.g. because it can't follow the cast, as @timfel said), it could conceivably decide to drop support for universal modules and implement only graalpython-specific headers.
Although it is much better to try to design the API so that universal modules are truly universal, of course :)
@antocuni
> Note that in the current situation, what you get are calls like `HPyTuple_GET_SIZE(HPyContext ctx, HPy o)`: so, you don't have any type information anyway. Basically, with this proposal you would get the same bitcode as now, the only difference is that the C compiler does some more typechecking.
My point was more that with `HPyTuple_Check`, we have a place (a proper API call) where the implementation can attach that type information to the object behind the returned handle. With a C-level cast, we do not. So such a call is just better for us.
I am definitely for having the API enforce type safety via a cast. I would just like that cast to be inside an API function call (so an `extern` function in the "universal" header), and only maybe have the CPython headers use a `#define` that does a C-level cast.
> My point was more that with HPyTuple_Check, we have a place (a proper API call) where the implementation can attach that type information to the object behind the returned handle. With a C-level cast, we do not. So such a call is just better for us.
I think I am missing a point here. The tuple object will be constructed either by calling `HPyTuple_New` or by creating the tuple inside Python code: so these are the obvious places where to attach information to the object.
Unless you are talking about attaching information to the handle itself: I can't see a reason to do that, but maybe you have a good use case for it that I can't think of.
> I am definitely for having the API enforce type safety via a cast, I would just like that cast to be inside an API function call (so an extern function in the "universal" header), and only maybe for the CPython headers use a #define that uses a C-level cast
I see your point. I think this is a design goal that we should try to clarify soon: do we want the universal API to be completely opaque, or are we fine doing some work directly in the header with macros or inline functions?
I see pros & cons for both. The biggest con is efficiency, of course: e.g. currently `HPy_IsNull()` is implemented as a macro, and it's the preferred way to check whether an API call has returned an error; having to call a function for it sounds excessively expensive (although I am ready to be proved wrong by benchmarks :))
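For reference, the kind of header-level macro being discussed might look like this (a sketch with a hypothetical handle layout; the real HPy type is opaque and may differ):

```c
#include <stdint.h>

/* Hypothetical handle layout: one opaque word. */
typedef struct { intptr_t _i; } HPy;

/* The null/error handle and its check, done entirely in the header:
   no function call, the compiler inlines it to a single comparison. */
#define HPy_NULL      ((HPy){0})
#define HPy_IsNull(h) ((h)._i == 0)
```

Making `HPy_IsNull` an opaque function instead would turn every error check into a call across the API boundary, which is the efficiency concern raised above.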
> The tuple object will be constructed by either calling `HPyTuple_New` or by creating the tuple inside python code: so these are the obvious places where to attach information to the object.
What if someone creates a native subclass of tuple? These will have to be allocated using `malloc`, so that's just some native allocation. Will we force them to do an upcall where we can attach the type information?
> What if someone creates a native subclass of tuple? These will have to be allocated using malloc, so that's just some native allocation. Will we force them to do an upcall where we can attach the type information?
To instantiate an object given a type, you'll have to call an `HPy_New` or something like that: of course you will be able to use a custom allocator if you want, but the allocator will be called by the HPy runtime. You can't just `malloc` a piece of memory and call it an HPy object, because the underlying memory layout is completely opaque, so you will have no idea where to store e.g. the equivalent of `ob_type`.
@antocuni alright, you convinced me :-) I am mostly concerned with macros that access struct members, which I would like to avoid, because in our case that often means a wrapper or exposing that member on the Python object.
> I am mostly concerned with macros that access struct members, which I would like to avoid, because in our case that often means a wrapper or exposing that member on the Python object.
Yes sure, I completely agree. The idea of hpy is that all the runtime data structures will be opaque, so no macros and no direct access to fields. Even the `HPy` type is opaque: it is big enough to contain either a pointer or an index into an array, but it's up to the implementation to decide what to use.
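A sketch of what "big enough to contain either a pointer or an index" means in practice (hypothetical layout; each implementation is free to choose its own interpretation of the word):

```c
#include <stdint.h>

/* The handle is a single opaque word. */
typedef struct { intptr_t _i; } HPy;

/* A CPython-style implementation can store a PyObject* in the word... */
static void *handle_as_pointer(HPy h) { return (void *)h._i; }

/* ...while a PyPy-style implementation can store an index into a
   handle table, leaving objects free to move under a moving GC. */
static intptr_t handle_as_index(HPy h) { return h._i; }
```

Extension code never looks inside the struct, so both interpretations work behind the same API.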
The current C-API provides lots of functions which are duplicates, the only difference being whether they do a typecheck of the argument or not. E.g., `PyTuple_GET_SIZE` vs `PyTuple_Size`, `PyTuple_GET_ITEM` vs `PyTuple_GetItem`, etc. The following idea tries to address this. The idea is to have special C types of handles which represent specialized objects and/or protocols, e.g. `HPyTuple`. You can cast an `HPy` to an `HPyTuple` with or without typechecks. Functions like `HPyTuple_Size` take an `HPyTuple` argument, so it is impossible to call them with a plain `HPy`. For example:
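The example that followed is not preserved in this extract; a hedged reconstruction of the intended usage, based only on the description above (toy bodies, invented purely for illustration):

```c
#include <stdint.h>

typedef struct { intptr_t _i; } HPy;
typedef struct { intptr_t _i; } HPyTuple;

/* Checked cast from a generic handle to a tuple-typed handle.
   (Toy stand-in: a real runtime would verify the object's type here.) */
static HPyTuple HPyTuple_Check(HPy obj)
{
    return (HPyTuple){ obj._i };
}

/* Takes an HPyTuple, so passing a plain HPy is a compile-time error:
   the duplicate checked/unchecked C-API pairs become unnecessary. */
static intptr_t HPyTuple_Size(HPyTuple t)
{
    return t._i;  /* toy implementation */
}
```

Because `HPy` and `HPyTuple` are distinct struct types, the C compiler enforces the distinction even though both compile down to the same word-sized value.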