JuliaORNL / JACC.jl

CPU/GPU parallel performance portable layer in Julia via functions as arguments
MIT License

Would like to pass in more than just numbers and arrays #56

Open PhilipFackler opened 7 months ago

PhilipFackler commented 7 months ago

This commit causes me problems: https://github.com/JuliaORNL/JACC.jl/commit/6d1265714f58855c2e5142835fe274350d4971ae

parallel_for now accepts only Numbers and Arrays (or CuArrays in the CUDA case). I'd also like to pass in instances of structs and tuples. Perhaps the function could use Vararg{Any}?
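To illustrate the request, here is a minimal sketch in plain Julia. The names restricted_for, permissive_for, and Params are hypothetical stand-ins and not JACC's actual definitions; the loops are serial and only show how the argument type annotation decides what can be passed in.

```julia
# Hypothetical stand-ins, NOT JACC's real API.

# Restrictive form: every extra argument must be a Number or an Array.
function restricted_for(N::Integer, f::Function, x::Union{Number, Array}...)
    for i in 1:N
        f(i, x...)
    end
end

# Permissive form: any argument type is forwarded to the kernel function.
function permissive_for(N::Integer, f::Function, x::Vararg{Any})
    for i in 1:N
        f(i, x...)
    end
end

# A small user-defined struct to pass into the loop body.
struct Params
    scale::Float64
end

out = zeros(4)
kernel(i, p, y) = (y[i] = p.scale * i)

# Works: the struct is passed through untouched.
permissive_for(4, kernel, Params(2.0), out)

# No matching method: Params is neither a Number nor an Array.
can_call_restricted = applicable(restricted_for, 4, kernel, Params(2.0), out)
```

Calling restricted_for with the struct would raise a MethodError, which is the behavior described above.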

williamfgc commented 6 months ago

@michel2323 please let us know your thoughts on this issue. Thanks!

michel2323 commented 6 months ago

Oh ok. This is restrictive. Okay, I'm gonna take a look.

michel2323 commented 6 months ago

The problem with Vararg{Any} is that dispatch and precompilation won't work. We need different types for CPU, CUDA, etc.

michel2323 commented 6 months ago

How do you want to decide a struct should be on the CPU or GPU in the loop?

michel2323 commented 6 months ago

I looked at OpenMP and they have the target in the #pragma. So one solution would be to pass a backend argument to parallel_for etc. CUDA.jl, oneAPI.jl, and AMDGPU.jl provide CUDABackend(), oneAPIBackend(), and ROCBackend(), respectively. See for example here: https://github.com/JuliaGPU/CUDA.jl/blob/7f725c0a117c2ba947015f48833630605501fb3a/src/CUDAKernels.jl#L21 .

For the CPU we could define our own backend or use the one straight from KA (KernelAbstractions.jl). All this would also remove the need for JACC.Array.
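A rough sketch of what dispatching on a backend argument could look like, using only Base Julia. The types ThreadsBackendSketch and GPUBackendSketch and the function parallel_for_sketch are hypothetical placeholders invented for this example; they are not the backend types from CUDA.jl, oneAPI.jl, AMDGPU.jl, or KA, and the "GPU" method just runs a serial loop where a real kernel launch would go.

```julia
# Hypothetical backend types for illustration only.
abstract type BackendSketch end
struct ThreadsBackendSketch <: BackendSketch end
struct GPUBackendSketch <: BackendSketch end

# CPU method: multithreaded loop; extra arguments can be of any type.
function parallel_for_sketch(::ThreadsBackendSketch, N::Integer, f::Function, x...)
    Threads.@threads for i in 1:N
        f(i, x...)
    end
    return :threads
end

# "GPU" method: placeholder. A real version would launch a device kernel here.
function parallel_for_sketch(::GPUBackendSketch, N::Integer, f::Function, x...)
    for i in 1:N
        f(i, x...)
    end
    return :gpu_placeholder
end

a = zeros(8)
axpy(i, alpha, y) = (y[i] += alpha * i)

# The first argument selects the method, so x... needs no type restriction.
used = parallel_for_sketch(ThreadsBackendSketch(), 8, axpy, 2.0, a)
```

With this shape, dispatch happens on the backend rather than on the array type, so the remaining arguments could be left unconstrained.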

PhilipFackler commented 6 months ago

> How do you want to decide a struct should be on the CPU or GPU in the loop?

This is when launching a kernel. As long as the struct satisfies isbitstype, I should be able to copy it into the kernel just like a Number. And this works with the earlier version that uses ... (varargs) parameters.
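For reference, a small example of checking the isbitstype condition with Base Julia. The struct names Coeffs and WithVector are made up for this illustration.

```julia
# Immutable struct with only primitive fields: plain bits, no references.
struct Coeffs
    a::Float64
    b::Int32
end

# Holds a reference to a heap-allocated, mutable Vector.
struct WithVector
    data::Vector{Float64}
end

coeffs_ok = isbitstype(Coeffs)                 # true
tuple_ok  = isbitstype(Tuple{Float64, Int})    # true
vector_ok = isbitstype(WithVector)             # false

# isbits checks a value instead of a type.
value_ok = isbits((1.0, 2))                    # true
```

Values of the first two kinds can be copied by value like a Number, whereas WithVector carries a reference and would need its array handled separately.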

kmp5VT commented 6 months ago

@michel2323 @PhilipFackler I have a rough draft of this idea in this PR. It is similar to how we launch CPU vs GPU kernels in the ITensors.jl package.

williamfgc commented 6 months ago

@kmp5VT thanks, we'd like to explore these ideas while keeping the API really simple for end users who would like to stay closer to their science and not necessarily the computational aspects. Otherwise, there is very little added value in using JACC.

A good exercise is to examine the final API and integration effort. Currently, we are prioritizing issues that allow for easy integration with apps and less maintenance. Most of the design decisions come from looking at apps and what they need, and then making incremental progress.

@PhilipFackler posted a minimal example in #51