data-apis / array-api

RFC document, tooling and other content related to the array API standard
https://data-apis.github.io/array-api/latest/
MIT License
215 stars 45 forks source link

stream in `__dlpack__` should not be required to be integers #183

Closed oleksandr-pavlyk closed 3 years ago

oleksandr-pavlyk commented 3 years ago

https://data-apis.org/array-api/latest/API_specification/array_object.html#dlpack-self-stream-none

says that stream keyword argument in __dlpack__(self, /, *, stream=None) should be an integer or None.

However, in dmlc/dlpack#57 discussion (see https://github.com/dmlc/dlpack/issues/57#issuecomment-771854478) in the case of working with SYCL, the stream may need to be a class representing an in-order sycl queue.

Can we remove the requirements for stream to be an integer ?

leofang commented 3 years ago

I think this is reasonable. We can specify the object type allowed for each platform if stream is not None, since we are already tabulating it anyway. So, for example, we can move the integer requirement to CUDA/HIP, and specify another acceptable type for SYCL.

in the case of working with SYCL, the stream may need to be a class representing an in-order sycl queue.

Is there a determined Python type for an in-order sycl queue?

Another thing: are you happy (from the SYCL perspective) with the optional kw argument being named as "stream"? I don't remember what was your opinion (or if you were consulted when it's proposed) 😅

oleksandr-pavlyk commented 3 years ago

The Python type passed to __dlpack__ would be dpctl.SyclQueue (see https://intelpython.github.io/dpctl/latest/docfiles/dpctl_pyapi/SyclQueue.html) such that it is_in_order property is True.

In lieu of the SyclQueue, a "SyclQueueRef" capsule carrying a pointer to sycl::queue class instance could be accepted (from which dpctl.SyclQueue can be created). Requirement for the queue to be in order will be dynamically checked.

The name of the keyword does not align with SYCL (see page for sycl::stream class).

Perhaps it would be a good idea to use a reasonably self-explanatory name for the keyword related to its intended use, e.g. synchronize. We could then say that CUDA/HIP, one should use synchronize=stream and include present discussion of stream value choices, and for SYCL we could do synchronize=in_order_queue.

rgommers commented 3 years ago

This seems reasonable indeed, and consistent with the discussion in https://github.com/dmlc/dlpack/issues/57#issuecomment-771854478.

By the way @oleksandr-pavlyk, doesn't DLPack need SYCL added to DLDeviceType before this helps?

oleksandr-pavlyk commented 3 years ago

@rgommers Yes, I intend to open a PR for dlpack requesting to add 3 new types kDLSYCL_GPU, kDLSYCL_CPU and kDLSYCL_ACCELERATOR in a matter of days.

rgommers commented 3 years ago

I'm working on a doc update now. I'm not so sure about a synchronize keyword, that sounds more like a boolean option. stream also aligns with __cuda_array_interface__. Explicitly adding a SYCL note and extending the type annotation should be clear enough I hope.

Question on dpctl.SyclQueue, is that a thing that other Python libraries are going to understand? I.e. is it the default/only representation of cl::sycl::queue at the Python level? If it's specific to Intel’s DPCPP compiler or the dpctl library, then it's probably not the best thing to include?

rgommers commented 3 years ago

Otherwise, shouldn't it just be an opaque PyCapsule that wraps a cl::sycl::queue object?

oleksandr-pavlyk commented 3 years ago

dpctl.SyclQueue is a Python class that stores the cl::sycl::queue object, but it also provides plenty of additional methods and properties, which PyCapsule would not.

dpctl.SyclQueue provides a way to retrieve the capsule, using queue._get_capsule() method.

dpctl is designed to be built with DPC++ toolchain and uses some of sycl::ONEAPI extensions.

I do not insist on documenting dpctl.SyclQueue as the type for the stream= keyword, as long as it is allowed to be a Python object such that passing dpctl.SyclQueue or PyCapsule would be permitted.

rgommers commented 3 years ago

Thanks for the explanation @oleksandr-pavlyk.

as long as it is allowed to be a Python object such that passing dpctl.SyclQueue or PyCapsule would be permitted.

Then how about a type annotation stream : Optional[Union[int, Any]] and a note like "For other device types which do have a stream, queue or other synchronization mechanism, the most appropriate type is not yet determined. E.g., for SYCL one may want to use an object containing an in-order cl::sycl::queue. This may be standardized in a future version of this API standard".

BvB93 commented 3 years ago

Out of curiosity, what would the signature of a reasonable __dlpack__ implementation currently look like with these changes? Currently it seems that the only type that safely be passed stream parameter of an arbitrary __dlpack__ method is None, with any other type (be it int, dpctl.SyclQueue or something) potentially raising depending on the particular implementation.

Perhaps I'm being too critical here, but this diversification with (potentially) highly library-specific parameter types sounds counterproductive to the standardization the Python array API standard is trying to achieve.

oleksandr-pavlyk commented 3 years ago

@BvB93 This is why the standard should not unnecessarily restrict the type of the stream keyword. It should be a allowed to be any Python object understood by both producer and consumer of the DLTensor.

BvB93 commented 3 years ago

@BvB93 This is why the standard should not unnecessarily restrict the type of the stream keyword. It should be a allowed to be any Python object understood by both producer and consumer of the DLTensor.

That's understandable. I'd just like to emphasize that, at the current rate, you're heading in a direction wherein no type safety can be guaranteed for the stream keyword of the general __dlpack__ protocol, the latter being at the whim of the implementation in question (this is assuming that an implementation will raise upon receiving an unsupported object type, that is).

leofang commented 3 years ago

this is assuming that an implementation will raise upon receiving an unsupported object type

If I have a library supporting all these platforms (CUDA, HIP, SYCL, Metal, etc) then unfortunately this runtime check is a price I have to pay, since DLPack attempts to cover all these device types.

rgommers commented 3 years ago

then unfortunately this runtime check is a price I have to pay, since DLPack attempts to cover all these device types.

I agree, not really a way around that. It's better than have separate methods like:

Because then you're back to "what do I do if my library supports multiple device types".