JuliaGPU / CUDA.jl

CUDA programming in Julia.
https://juliagpu.org/cuda/

Overview of what is supported for kernel programming #58

Open mattcbro opened 5 years ago

mattcbro commented 5 years ago

Many thanks to the developers of this amazing package! I thought I would submit a list of gotchas and issues I ran into trying to get my kernel to work. Most of these are known, and I will try to provide links where appropriate.

I realize this may be more appropriate in the documentation, but unfortunately it is a very fluid situation and I couldn't figure out where I should contribute my usage notes.

The power operator ^ does not work in kernel code. Use CUDAnative.pow() instead.
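
A minimal sketch of that workaround, assuming a CUDAnative-era setup (the kernel name and arguments here are made up for illustration):

```julia
using CUDAnative, CuArrays

# Hypothetical kernel: raise each element to a power with CUDAnative.pow,
# since x[i]^p would not compile in device code.
function pow_kernel!(out, x, p)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(out)
        out[i] = CUDAnative.pow(x[i], p)
    end
    return nothing
end

x = CuArray(rand(Float32, 1024))
out = similar(x)
@cuda threads=256 blocks=4 pow_kernel!(out, x, 2.0f0)
```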

Ordinary array constructors do not work inside a kernel, e.g. [1, 2, 3] or Array([1, 2, 3]).

In theory you could use StaticArrays.jl (MArray, MVector, etc.). Unfortunately a bug may cause intermittent memory faults, see JuliaGPU/CUDAnative.jl#340 and https://github.com/JuliaGPU/CuArrays.jl/issues/278. I hope to see this resolved soon, though the current #master branch of this package does not compile on my Linux Mint 19 machine.

What does work for small arrays are tuples, e.g. (arr[1,2], arr[2,2], arr[3,2]) instead of arr[:,2]. I'm also seeing much faster performance with tuples than with MVectors for this sort of thing.
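
A small sketch of the tuple trick inside a kernel (the kernel and array names are invented; it assumes a 3×N Float32 matrix):

```julia
using CUDAnative, CuArrays

# Hypothetical kernel: sum each column of a 3×N matrix without slicing.
function colsum_kernel!(out, arr)
    j = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if j <= size(arr, 2)
        col = (arr[1, j], arr[2, j], arr[3, j])   # instead of arr[:, j]
        out[j] = col[1] + col[2] + col[3]
    end
    return nothing
end

arr = CuArray(rand(Float32, 3, 1024))
out = CuArray(zeros(Float32, 1024))
@cuda threads=256 blocks=4 colsum_kernel!(out, arr)
```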

reshape() does not work and you cannot do array slicing in a kernel. However, you can use view(): instead of arr[:,j], write view(arr, :, j).
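
A sketch of the view() workaround along the same lines (the kernel name is a placeholder; launch it with @cuda as above):

```julia
using CUDAnative

# Hypothetical kernel: sum of squares of each column via a non-allocating view.
function colsumsq_kernel!(out, arr)
    j = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if j <= size(arr, 2)
        col = view(arr, :, j)      # arr[:, j] would not work here
        s = 0.0f0
        for i in 1:length(col)
            s += col[i] * col[i]
        end
        out[j] = s
    end
    return nothing
end
```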

Instead of sqrt(), use CUDAnative.sqrt().
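
For example, a line of kernel code computing a distance might read (variable names are placeholders):

```julia
r = CUDAnative.sqrt(x*x + y*y)   # per the note above, plain sqrt() would not work here
```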

There is no real support for complex exponentials. So instead of exp(im * theta), write CUDAnative.cos(theta) + 1im * CUDAnative.sin(theta).
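
That workaround can be wrapped in a small device-side helper, sketched here (the helper name is made up; it assumes Float32 angles):

```julia
using CUDAnative

# Hypothetical stand-in for exp(im * theta) in device code.
device_cis(theta::Float32) = CUDAnative.cos(theta) + 1im * CUDAnative.sin(theta)

# Or, keeping everything explicitly 32-bit (see the Float32 advice below):
# device_cis(theta::Float32) = ComplexF32(CUDAnative.cos(theta), CUDAnative.sin(theta))
```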

@device_code_warntype in front of your @cuda call is quite useful for ironing out type instabilities, which must be quashed in your kernel code.
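
Usage is just a prefix on the launch, for example (the kernel and its arguments are placeholders):

```julia
using CUDAnative

# Prints the inferred types of the generated device code for this launch.
@device_code_warntype @cuda threads=256 blocks=4 my_kernel!(out, x)
```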

The stack traces for kernel code that does not compile are often not quite right. I think this is going to be resolved shortly; see JuliaGPU/CUDAnative.jl#306.

Since affordable Nvidia GPUs have crippled double-precision support, you often want to use Float32 for your floating-point arrays. As such, remember to properly create Float32 arrays and avoid mixing Float64 with Float32; the same applies to complex types. You can also force 32-bit floating-point constants using the 1.0f0 literal syntax. This avoids a bunch of spurious type instabilities.
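
For instance (a small sketch; the variable names are arbitrary):

```julia
x = 2.0f0                 # Float32 literal; plain 2.0 is Float64
z = ComplexF32(1f0, 0f0)  # 32-bit complex, i.e. Complex{Float32}
y = 0.5f0 * x             # stays Float32; 0.5 * x would promote to Float64
```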

maleadt commented 5 years ago

This should probably be put in the documentation.

mattcbro commented 5 years ago

Sure, where do you want it to go?

calebwin commented 4 years ago

Is there any update on this? Is there an up-to-date list of what CUDAnative.jl supports from the Julia language?

maleadt commented 4 years ago

No, there isn't.