chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Support all/more kinds of reductions on GPU #24932

Open e-kayrakli opened 2 weeks ago

e-kayrakli commented 2 weeks ago

https://github.com/chapel-lang/chapel/pull/24787 will add +, min and max reductions. These are probably the most common reduction kinds, but more importantly, CUB/hipCUB has direct support for those.

minloc and maxloc reductions are also supported in CUB via the ArgMin and ArgMax functions. We'll need to wire some more things from the compiler into the runtime. I expect this to be relatively straightforward to implement in both the compiler and the runtime.
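To make the shape of a minloc result concrete, here is a minimal host-only C++ sketch. It is a sequential reference, not GPU code and not the Chapel runtime. `IndexValue`, `minlocReference`, and `lowIndex` are hypothetical names. The sketch assumes the device-side reduction reports a zero-based offset together with the value, and that this offset must then be shifted into the array's own index space.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical (index, value) result pair, similar in shape to the
// key/value pair that an ArgMin-style reduction produces.
template <typename T>
struct IndexValue {
  std::ptrdiff_t index;
  T value;
};

// Sequential reference for a minloc-style reduction. A GPU implementation
// would compute the same answer in parallel.
// Precondition: data is non-empty.
// lowIndex is the (assumed) first index of the array being reduced.
template <typename T>
IndexValue<T> minlocReference(const std::vector<T>& data,
                              std::ptrdiff_t lowIndex) {
  IndexValue<T> best{0, data[0]};
  for (std::size_t i = 1; i < data.size(); ++i) {
    // Strict '<' keeps the first occurrence of the minimum on ties.
    if (data[i] < best.value) {
      best = {static_cast<std::ptrdiff_t>(i), data[i]};
    }
  }
  // Shift the zero-based offset into the array's own index space.
  best.index += lowIndex;
  return best;
}
```

A maxloc counterpart would differ only in the comparison.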

For other reduction types, we need to engineer a way to use CUB's generic Reduce interface, where we pass a function of ours into CUB to handle the reduction. In other words, CUB is supposed to call the accumulate function of the user-defined reduction implementation. Needless to say, the priority should be the reduction kinds that we already have, not arbitrary user-defined reductions.
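The "pass a function of ours" idea can be sketched in host-only C++ as follows. This is not CUB code: `genericReduce` is a sequential stand-in for a generic reduce entry point that takes an operator and an initial value, and `ProductReduce`, `AccumulateFunctor`, and `reduceWith` are hypothetical names chosen for illustration. The member names `identity` and `accumulate` are assumptions about what a reduction implementation might expose.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reduction implementation: product of all elements.
struct ProductReduce {
  using State = long long;
  static State identity() { return 1; }
  static State accumulate(State state, State x) { return state * x; }
};

// Adapter that exposes accumulate() as a plain binary call operator,
// the shape a generic reduce interface typically expects.
template <typename Op>
struct AccumulateFunctor {
  typename Op::State operator()(typename Op::State a,
                                typename Op::State b) const {
    return Op::accumulate(a, b);
  }
};

// Host-only, sequential stand-in for a generic reduce(input, op, init).
template <typename T, typename BinaryOp>
T genericReduce(const std::vector<T>& in, BinaryOp op, T init) {
  T result = init;
  for (const T& x : in) {
    result = op(result, x);
  }
  return result;
}

// Convenience wrapper: reduce using the operator and identity of Op.
template <typename Op>
typename Op::State reduceWith(const std::vector<typename Op::State>& in) {
  return genericReduce(in, AccumulateFunctor<Op>{}, Op::identity());
}
```

For example, `reduceWith<ProductReduce>` applied to `{2, 3, 4}` yields 24. A parallel implementation may group the operator applications differently, so the operator generally needs to be associative for the result to match the sequential one.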