NVIDIA / cccl

CUDA Core Compute Libraries
Other
961 stars 119 forks source link

Add non-commutative reduction #774

Open brycelelbach opened 3 years ago

brycelelbach commented 3 years ago

We're missing a very useful form of reduction algorithm:

image

jrhemstad commented 3 years ago

How about left_fold for the algorithm name?

bdice commented 2 years ago

This paper about ranges::fold is somewhat related: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2322r5.html, https://github.com/cplusplus/papers/issues/1004

upsj commented 1 year ago

As I'm looking into this from the cub side, maybe also a few thoughts:

jrhemstad commented 1 year ago

See https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2214r0.html#rangesreduce for a discussion of naming between fold and reduce.

I think the precedent is clear for fold being the appropriate name for a non-commutative reduce.

Granted, all the of discussion in p2214 and p2322 are around a sequential implementation of fold. It could be problematic if fold is spec'd in such a way to require sequential execution.

I'll defer to @codereport @ericniebler and others on the appropriate naming for a non-commutative, parallel reduce/fold.

codereport commented 1 year ago

I like the idea of reduce_*, something like reduce_assoc or reduce_associative_only/reduce_noncommutative. The former isn't great because assoc has meaning in other contexts and the latter is a bit verbose. I honestly don't think it matters too much though. After C++23, we have:

It isn't like C++ has a very cohesive naming story when it comes to reductions.

bdice commented 1 year ago

reduce_noncommutative, fold, or reduce_ordered are my favorites but I lean towards reduce_noncommutative most strongly out of the options presented.

Compare to fold: I'm used to hearing "left fold" or "right fold", but this is neither -- we wish to promise only the preservation of operand order (associative but noncommutative). I would be happy with fold if we had consensus that it's unambiguous that its action promises no ordering -- or ambiguous enough that it prevents users from making an assumption about ordering. 😛

Compare to reduce_ordered: The name reduce_noncommutative is explicit about its relaxed assumption (relaxed compared to reduce), as opposed to the alternative reduce_ordered which tells me less. For the same reason, I don't think reduce_associative_only is the best choice because it doesn't name the part that is different compared to reduce.

upsj commented 1 year ago

Just to give the reasoning behind my suggestion, I agree with most of the arguments here: reduce_ordered describes the algorithm behavior, while reduce_noncommutative describes the algorithm assumptions. In my use case, the ordering comes up explicitly by joining adjacent blocks without reordering, but this can of course also be framed in terms of commutativity. I'm kind of math-biased: Is commutativity (the term) known well-enough to use it?

bdice commented 1 year ago

I don’t really know how the (parallel) reduce algorithm implements its partial results / reordering, and that doesn’t seem important to explain in the name. Knowing the mathematical concept of commutativity is already a prerequisite for understanding the appropriate application of reduce, so we can have a reasonable expectation of that being understood by the user (or easily searchable because it has a specific and unambiguous meaning).

The behavior is non-deterministic if binary_op is not associative or not commutative.

https://en.cppreference.com/w/cpp/algorithm/reduce

codereport commented 1 year ago

@elbeno I know you don't work at NVIDIA but I know you have wanted this algorithm for a while and might have some thoughts on the name :slightly_smiling_face:

codereport commented 1 year ago

I just discovered that Futhark actually has both of the reductions. They are called: