Clear distinction between C, OpenMP and OpenCL primitives

Currently, it is possible to construct Phrases for the OpenCL backend that contain C and OpenMP primitives such as ReduceSeq, where OpenCLReduceSeq should be used instead.

The difference between the two primitives is that OpenCLReduceSeq takes the AddressSpace of the accumulator as an additional parameter.
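To make the difference concrete, here is a minimal Scala sketch. The type definitions are simplified stand-ins and not shine's actual definitions: only the names Phrase, ReduceSeq, OpenCLReduceSeq and AddressSpace come from this issue, while Identifier, the field names and the argument order are assumptions made for illustration.

```scala
// Simplified stand-ins for illustration; not the actual shine definitions.
object PrimitiveSketch {
  trait Phrase

  // Hypothetical leaf phrase so the example can build something.
  final case class Identifier(name: String) extends Phrase

  // The OpenCL address spaces an accumulator could live in.
  sealed trait AddressSpace
  object AddressSpace {
    case object Private extends AddressSpace
    case object Local   extends AddressSpace
    case object Global  extends AddressSpace
  }

  // C/OpenMP-style reduction: no address space for the accumulator.
  final case class ReduceSeq(f: Phrase, init: Phrase, array: Phrase) extends Phrase

  // OpenCL-style reduction: the same arguments plus the accumulator's address space.
  final case class OpenCLReduceSeq(
    f: Phrase,
    init: Phrase,
    initAddrSpace: AddressSpace,
    array: Phrase
  ) extends Phrase

  val add  = Identifier("add")
  val zero = Identifier("zero")
  val xs   = Identifier("xs")

  // Both values are plain Phrases, so the type system does not stop the
  // first one from being handed to the OpenCL backend.
  val cStyle: Phrase   = ReduceSeq(add, zero, xs)
  val oclStyle: Phrase = OpenCLReduceSeq(add, zero, AddressSpace.Private, xs)
}
```

In this sketch, both values have the static type Phrase, which is exactly why the mix-up described above goes unnoticed at construction time.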

I see two angles from which this problem can be viewed at the moment.

One is that ReduceSeq should not be in core. We could then choose primitives only from core and OpenCL when creating an OpenCL expression, which would remove the need for the prefixes (however, C could still be used accidentally).

The reasoning would be:

The OpenCL prefixes on these primitives are mainly needed because the core package, which is always included, contains primitives like ReduceSeq that cannot be used in OpenCL. To make the OpenCL primitives convenient to use, they have completely different names rather than just living in another package.

This shows that placing ReduceSeq in core is actually a design mistake: it cannot be used in OpenCL and therefore cannot be part of a unifying core between the backends (there are more primitives with this problem, such as OpenCL-/SlideSeq or OpenCL-/Iterate).
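The layout this first angle suggests could look roughly like the Scala sketch below. It is an assumption-laden illustration, not the actual shine package structure: nested objects stand in for packages to keep the example self-contained, and Identifier, the field names and the argument order are hypothetical.

```scala
// Nested objects stand in for packages; all definitions are illustrative.
object PackageSketch {
  // Shared between all backends: no ReduceSeq in here.
  object core {
    trait Phrase
    final case class Identifier(name: String) extends Phrase
  }

  // C-specific primitives.
  object c {
    import core._
    final case class ReduceSeq(f: Phrase, init: Phrase, array: Phrase) extends Phrase
  }

  // OpenCL-specific primitives: same name, no prefix, extra address space.
  object opencl {
    import core._
    sealed trait AddressSpace
    case object Private extends AddressSpace
    case object Local   extends AddressSpace
    case object Global  extends AddressSpace

    final case class ReduceSeq(
      addrSpace: AddressSpace,
      f: Phrase,
      init: Phrase,
      array: Phrase
    ) extends Phrase
  }

  // An OpenCL expression only imports core and opencl, so the unprefixed
  // name ReduceSeq resolves to the OpenCL variant.
  object openclProgram {
    import core._
    import opencl._
    val sum: Phrase =
      ReduceSeq(Private, Identifier("add"), Identifier("zero"), Identifier("xs"))
  }
}
```

Nothing in this sketch prevents someone from writing `import c._` in an OpenCL program, which is the accidental use of C mentioned above.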

The other angle from which this problem can be viewed is that the primitives should not differ between the backends. This would mean that there is only a single ReduceSeq in core, and that this primitive takes the AddressSpace as a parameter. Depending on the backend, the definitions of the Kinds would change, e.g., AddressSpace = Heap | Stack for C and AddressSpace = Private | Local | Global for OpenCL. However, at least currently, we would not make any use of the address space in C and would be carrying around unused information. Also, I do not immediately see how we can make sure that the correct AddressSpace is used while constructing expressions.
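As a rough illustration of this second angle, the Scala sketch below defines one ReduceSeq whose address-space kind depends on the backend. Everything beyond the names ReduceSeq, AddressSpace, Heap, Stack, Private, Local and Global is an assumption. In particular, tracking the backend's kind through a type parameter on Phrase is only one conceivable way to reject a wrong AddressSpace at construction time; it is not something shine is known to do.

```scala
// Illustrative only: one ReduceSeq, with a backend-dependent address-space kind.
object UnifiedSketch {
  // AddressSpace = Heap | Stack for C.
  sealed trait CAddressSpace
  case object Heap  extends CAddressSpace
  case object Stack extends CAddressSpace

  // AddressSpace = Private | Local | Global for OpenCL.
  sealed trait OpenCLAddressSpace
  case object Private extends OpenCLAddressSpace
  case object Local   extends OpenCLAddressSpace
  case object Global  extends OpenCLAddressSpace

  // Assumption: a phrase is tagged with the address-space kind A of its backend.
  trait Phrase[+A]

  // Leaves carry no address space, so they fit every backend.
  final case class Identifier(name: String) extends Phrase[Nothing]

  // The single, shared primitive.
  final case class ReduceSeq[A](
    addrSpace: A,
    f: Phrase[A],
    init: Phrase[A],
    array: Phrase[A]
  ) extends Phrase[A]

  val add  = Identifier("add")
  val zero = Identifier("zero")
  val xs   = Identifier("xs")

  // The C variant has to carry an address space even though it is unused.
  val forC = ReduceSeq[CAddressSpace](Stack, add, zero, xs)

  val forOpenCL = ReduceSeq[OpenCLAddressSpace](Private, add, zero, xs)

  // Rejected by the compiler, because Heap is not an OpenCLAddressSpace:
  // val wrong = ReduceSeq[OpenCLAddressSpace](Heap, add, zero, xs)
}
```

The sketch also makes the stated drawback visible: the C variant is forced to pick Heap or Stack even though nothing consumes that choice.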

Do you have any thoughts or suggestions on this? Is there another angle that I missed? Which option would you choose?

I agree with the two problem angles. I think that the first angle is the simplest to adopt because it gives more freedom to the backends, even if unifying the primitives does not work out. It also does not prevent us from unifying the primitive interfaces as much as possible to provide a consistent experience between backends. This also aligns with my belief that we should separate some C features from our core. For example, if we want to allow heap allocation in the future, it should not always be available and therefore should not be in the core. This is also related to the vectorization primitives, which are currently in the core but would ideally be an additional feature, maybe shared between the backends that support it. To summarize, I think we should go for the simplest solution and keep collecting design issues and improvement ideas until implementing them becomes necessary.

rise-lang / shine, issue #26