ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
33.8k stars 2.47k forks source link

@reduce builtin #2698

Closed shawnl closed 3 years ago

shawnl commented 5 years ago

Accepted Proposal


https://github.com/ziglang/zig/issues/903#issuecomment-492300314

Branching on SIMD is complicated.[1] If only some of the values match you have to both execute the if and the else block, and merge the results. I think this should be explicit, which @Vector(_, bool) (which all vector comparisons return, although I do not fully understand the difference in zig between u1 and bool) having built-in functions that return bool: .all() and .any().

You might then run some scalar code, and if that code is pure then the compiler can still safely do optimizations that result in it being processed possibly twice. So in addition to a comptime block I think we also need a pure block that would give a compile error if the block is not pure.[2]

[1] Here is HSAIL talking about wavefront scheduling http://www.hsafoundation.com/html_spec111/HSA_Library.htm#PRM/Topics/02_ProgModel/wavefronts_lanes_and_wavefront_sizes.htm%3FTocPath%3DHSA%25C2%25A0Programmer's%2520Reference%2520Manual%2520Version%25201.1.1%2520%7CChapter%25202.%2520HSAIL%2520Programming%2520Model%7C2.6%2520Wavefronts%252C%2520Lanes%252C%2520and%2520Wavefront%2520Sizes%7C_____0 [2] I don't want to add too many keywords. Perhaps we could have attribute that can be attached to blocks, functions, variables, and other things that are just enum literals? (This is also very similar to C where _ is reserved.)

.comptime {
   foo();
}
.pure {
   bar();
}
fn extern .luacc call_my_lua_func(f: u32);
andrewrk commented 5 years ago

Thanks for bringing up this issue. This is something where you have a good understanding of the problem and myself (and probably others) will have a hard time following along until either we (1) go spend a significant time learning about SIMD and trying to use it or (2) you help us understand this use case more. One way you could do (2) would be to provide some Zig code using the current SIMD features, and demonstrate how it is unable to satisfy the use case. Then provide an altered example with your proposed feature and explain how it solves the problem.

Such an example would be necessary for a test case anyway, and would propel this issue forward.

shawnl commented 5 years ago

I opened this largely as a note to myself, partially to not forget things. This is mostly blocked on #903 , and yes I do want to explain some more what is going on.

andrewrk commented 5 years ago

@shawnl will you consider this counter proposal to your all, any thing?

@fold(comptime op: FoldOperator, vector: var) @typeInfo(vector).Vector.child

const FoldOperator = enum {
    Add,
    Sub,
    Mul,
    And,
    Or,
    Xor,
    // more?
};

it does, in pseudocode,

var result = operator(vector[0], vector[1]);
result = operator(result, vector[2]);
result = operator(result, vector[3]);
// ...
return result

Actually I don't see why this would even need to be a builtin. This would work just fine in userland.

So the example with an if statement:

if (fold(.And, v1 >= v2)) {
   // do your logic
}

I'm sure there's a better word than fold, maybe reduce? The functional programming people have great vocabulary words for these things.

(edit) The accepted proposal is for this to be a builtin. Continue reading below for reasoning.

shawnl commented 5 years ago

Actually I don't see why this would even need to be a builtin. This would work just fine in userland.

If we use a builtin we can use the llvm intrinsics for horizontal folding, unless it is possible to optimize to them. (note that fast math flags are considered, https://llvm.org/docs/LangRef.html#llvm-experimental-vector-reduce-v2-fadd-intrinsic )

I really like this proposal. And LLVM is using reduce, which isn't bad and better than fold.

andrewrk commented 5 years ago

Oh, nice, looks like it should be a builtin then. Whether we choose to use the experimental APIs or not can be an implementation detail. It looks like LLVM API and me came up with the same ideas :-)

gggin commented 4 years ago

Sorry, I misunderstand sth. A question, I care about the thing is the pure block. Is this be accepted?