SIMD horizontal adds

ppenzin commented 3 months ago

Originally thought to be a post-MVP feature: https://github.com/WebAssembly/simd/issues/20

There is a PR to LLVM to introduce shuffle patterns that, combined with other instructions, would translate to horizontal additions. That is motivation to restart the conversation on horizontal SIMD ops, or at least to provide a way to disseminate this among the runtimes.

@sparker-arm as the author of the patch, sorry to put you on the spot.
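For readers unfamiliar with the idea, here is a minimal scalar sketch of how a pairwise (horizontal) add can be written as two shuffles followed by an ordinary lane-wise add. It models lanes as Python lists; the helper names `shuffle`, `add` and `pairwise_add_via_shuffles` are made up for illustration, and the even/odd form shown is only one possible pattern, not necessarily the one used in the LLVM PR.

```python
def shuffle(a, b, indices):
    """Pick lanes from the concatenation of a and b (a lane-level model of a two-input shuffle)."""
    lanes = list(a) + list(b)
    return [lanes[i] for i in indices]


def add(a, b):
    """Ordinary lane-wise (vertical) add."""
    return [x + y for x, y in zip(a, b)]


def pairwise_add_via_shuffles(a, b):
    """Horizontal add of adjacent lanes, built from two shuffles and one add."""
    n = len(a)
    even = shuffle(a, b, range(0, 2 * n, 2))  # lanes 0, 2, 4, ...
    odd = shuffle(a, b, range(1, 2 * n, 2))   # lanes 1, 3, 5, ...
    return add(even, odd)


result = pairwise_add_via_shuffles([1, 2, 3, 4], [10, 20, 30, 40])
print(result)
```

A runtime that wants to emit a single hardware pairwise-add instruction has to recognise this whole shuffle, shuffle, add sequence, which is why the choice of canonical form matters.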

sparker-arm commented 3 months ago

From my brief look at this spec, it looks like we'd use dup_odd to pattern match a pairwise operation. So, IIUC, pattern matching with flexible vectors should be a bit easier than the current SIMD shuffles.

But we would still have the trouble of choosing a canonical form that maps well to hardware, and of getting all the runtimes to perform the matching. For instance, the current shuffle approach in LLVM would map to concat_lower_upper and that, again, isn't useful for the horizontal FP instructions that I'm aware of.

So, I would definitely be in favour of having dedicated wasm instruction(s), for both fixed and flexible!

Arm hardware-wise, Neon includes faddp for floats, which is chained for a full reduction, and addv for integer reduction. SVE includes faddv, which performs a recursive pairwise reduction on floats, but I'm not sure what we use for integers.
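To make "chained for a full reduction" concrete, here is a rough scalar model of a pairwise add in the style of faddp, applied repeatedly until every lane holds the total. The function names are hypothetical, lanes are plain Python lists, and the lane count is assumed to be a power of two.

```python
def faddp_model(a, b):
    """Model of a pairwise add: sum adjacent lanes of a, then adjacent lanes of b."""
    lanes = list(a) + list(b)
    return [lanes[i] + lanes[i + 1] for i in range(0, len(lanes), 2)]


def reduce_by_chaining(v):
    """Full horizontal sum by applying the pairwise add log2(len(v)) times."""
    width = len(v)  # assumed to be a power of two
    while width > 1:
        v = faddp_model(v, v)  # each step halves the number of distinct partial sums
        width //= 2
    return v[0]


step1 = faddp_model([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
total = reduce_by_chaining([1.0, 2.0, 3.0, 4.0])
print(step1, total)
```

For a four-lane vector this takes two pairwise steps, and the cost grows with the logarithm of the lane count.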

akirilov-arm commented 3 months ago

We have the SADDV and UADDV instructions to deal with integers (signed and unsigned respectively) in SVE. BTW another option for floating-point values is FADDA, which is strictly ordered, but realistically has a performance cost.

There are also pairwise operations, i.e. ADDP and FADDP.
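The distinction between a strictly ordered reduction and a pairwise one matters because floating-point addition is not associative. The sketch below is a scalar model only: `ordered_sum` accumulates left to right from an initial value, in the spirit of FADDA, and `tree_sum` adds adjacent pairs recursively. Both names are hypothetical, and the exact pairing order used by real hardware is an assumption here.

```python
def ordered_sum(init, v):
    """Strictly ordered reduction: accumulate lanes left to right."""
    acc = init
    for x in v:
        acc = acc + x
    return acc


def tree_sum(v):
    """Recursive pairwise reduction over adjacent lanes (power-of-two lane count assumed)."""
    v = list(v)
    while len(v) > 1:
        v = [v[i] + v[i + 1] for i in range(0, len(v), 2)]
    return v[0]


# The small terms are absorbed at different points depending on the order.
values = [1e20, 1.0, -1e20, 1.0]
print(ordered_sum(0.0, values), tree_sum(values))
```

On this input the two orders disagree, which is why an instruction's reduction order has to be pinned down if results are to be reproducible across runtimes.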

ppenzin commented 3 months ago

I vaguely remember horizontal ops were not great on x86. @rrwinterton, do you have any thoughts?

WebAssembly / flexible-vectors

SIMD horizontal adds #65