KhronosGroup / SYCL-Docs

SYCL Open Source Specification
Other
114 stars 68 forks source link

How to extract address space from raw pointers? #21

Open j-stephan opened 5 years ago

j-stephan commented 5 years ago

Imagine a device-side function with the following signature:

void foo(int* vec);

I don't know if vec comes from global, local, constant or private memory. However, inside foo I'd like to do something to vec which requires me to know the address space of the pointer, e.g. a cl::sycl::atomic_fetch_add. How do I tell the multi_ptr / atomic inside foo which address space is needed? Simply using a global_ptr will break if vec actually resides in local memory. Using multi_ptr will fail because the address space template parameter is missing. Creating an atomic by passing vec to its constructor will fail because vec isn't a multi_ptr. Using atomic_fetch_add on vec will fail because vec isn't an atomic type.

Some implementations (like ComputeCpp) internally use __global to annotate the pointer during device compilation. But even if there was a way to write something like void foo(__global int* vec) (there isn't as far as I know, ComputeCpp complains if I do this) this would be a bad idea because the address space attributes are implementation-defined.

Why do we need this? Sadly, there are libraries / frameworks out there that pass around raw pointers but where a SYCL backend is planned / worked on.

Edit: I also tried to overload foo with global_ptr, local_ptr etc. directly. This will fail because the call is ambigous.

keryell commented 5 years ago

Interestingly, Intel is trying hard to hide what you are asking for: https://github.com/intel/llvm/pull/348

Can you imagine an API that could be added to the standard?

j-stephan commented 5 years ago

An easy solution that doesn't require an API change would be to correctly deduce the overloads, i.e. foo(global_ptr), foo(local_ptr) and so on. This is not very intuitive, though, and might break user APIs.

From the programmer's point of view it would be preferable to allow multi_ptr construction on raw pointers without having to specify the address space. The compiler should be able to figure this out by itself since it knows about the address spaces anyway. On the other hand it should raise an error if the programmer tries to assign a raw pointer in local space to a global_ptr. Currently this doesn't happen, both the Intel and ComputeCpp compiler will happily compile if I pass the same pointer to global_ptr's and local_ptr's constructor.

Admittedly I haven't given this much thought yet (I only encountered the problem on Wednesday), I'll try to think this through on the weekend.

j-stephan commented 5 years ago

The weekend has passed... Apart from the solutions above the best I could come up with is something like cl::sycl::pointer_traits to be added to the specification. The interface would look something along the lines of

template <typename Ptr>
struct pointer_traits
{
    static_assert(is_raw_ptr_type(Ptr), "Ptr needs to be a raw pointer type");
    using pointer_t = /* implementation-defined */ Ptr;
    using address_space = /* implementation-defined */;
    // maybe add other traits here
};

Since the compiler needs to figure out the address space on its own anyway (if I understand Section 6.8 correctly), it would fill out the implementation-defined parts. A programmer could then use SFINAE or if constexpr to adapt to the different address spaces.

This is basically the problem multi_ptr tries to solve, it already encapsulates the functionality above. However, multi_ptr requires the user to specify the address space before using it. This makes sense because we can request a multi_ptr from a buffer accessor, a local accessor, and so on and the multi_ptr data structure has to know about its address space. It also renders us unable to construct it from a pointer we don't know the address space of.

So my straight-forward resolution still is to remove the requirement to specify the address space for the multi_ptr type. Instead the compiler needs to figure out the correct value for the address_space member of multi_ptr (or the Space template parameter). If this is not an option because of implications I'm not aware of (and I'm sure there are plenty) I'd shoot for the pointer_traits option.

keryell commented 5 years ago

Since the compiler needs to figure out the address space on its own anyway (if I understand Section 6.8 correctly), it would fill out the implementation-defined parts. A programmer could then use SFINAE or if constexpr to adapt to the different address spaces.

The problem is that this address space resolution can be done in LLVM or even in the SPIR-V backend or whatever... So you might not have this information inside Clang as a type trait... :-(

multi_ptr was designed:

j-stephan commented 5 years ago

The problem is that this address space resolution can be done in LLVM or even in the SPIR-V backend or whatever... So you might not have this information inside Clang as a type trait... :-(

I have to admit that my knowledge about compiler construction is a bit limited. But the backends will have to look up this information, too - why can't the frontend do the same?

To avoid requiring this kind of address-space inference by avoiding using raw pointers. Of course this means passing around the multi_ptr type. But with auto nowadays it is easier;

While I can understand this intent with regard to new code I believe this is an oversight if we consider legacy code bases. If those have a raw pointer API the design of multi_ptr or the lack of a feature to otherwise extract the address space becomes a major obstacle.

bader commented 4 years ago

SYCL WG: there is an ongoing discussion internally. We will get back as soon as we have an update.

fraggamuffin commented 2 years ago

related? https://gitlab.khronos.org/sycl/Specification/-/issues/607 need to decide if this is the right direction for SYCL