penzn opened 1 year ago
This is a partial answer to @titzer's question about what the alternatives to the "union" approach are. I haven't looked into the newer operations as closely as the older ones.
Looked into this as a side effect of a different project.
A true FMA can only be emulated via integer ops: the inputs need to be broken up into components, the multiply and add performed on the pieces, and the result rounded and stored back into a float. It should take about 5 additions and 5 multiplications to get the result. This is expensive, though some existing SIMD instructions have even worse lowerings (unsigned int conversions, for example).
Edit: removed a couple paragraphs describing emulation of x86 floating-point min and max, since we already have those in the standard. Thanks to @abrown for pointing this out.
We have both deterministic variants in the spec already:

- `f32x4.relaxed_min` is either `f32x4.min` or `f32x4.pmin`
- `f32x4.relaxed_max` is either `f32x4.max` or `f32x4.pmax`
- `f64x2.relaxed_min` is either `f64x2.min` or `f64x2.pmin`
- `f64x2.relaxed_max` is either `f64x2.max` or `f64x2.pmax`
A bit of backstory for the discussion, some of this is opinion, but hopefully at least somewhat helpful.
I think it is useful to think about the operations as belonging to two categories: one dealing with floating point semantics and the other with other platform specifics (mostly integer). What this allows is separating questions regarding acceptable floating point output from other, arguably less tricky ones, like encoding invalid values when converting floats to ints. This division is somewhat subjective, but might become clearer with more concrete examples below.
Relaxed versions of existing 'integer' SIMD operations

- `i8x16.swizzle`, different treatment of out-of-bounds lane indices
- `laneselect`, different lane encoding in the mask

Swizzle, laneselect, and float-to-int conversions in the existing SIMD spec have Arm semantics; the new operations match them on Arm while having different output on x86. Unlike floating point, the differences here are much more subjective (for example, should the invalid value be all zeros or all ones), and it might even be possible to imagine a world where both flavors coexist. Emulating such operations is likely to be less tedious than trying to emulate an operation with better FP accuracy, plus they generally don't deviate from semantics already established for scalar operations.
Relaxed versions of existing floating point SIMD operations

- `fmin`, different treatment of +/- 0.0 and NaN inputs
- `fmax`, different treatment of +/- 0.0 and NaN inputs

The gist is that x86 operations, unlike Arm operations, "short circuit" on NaN and disregard the sign of zero. Code that cannot rule out NaN inputs would likely expect more symmetric variants than what x86 provides natively, and there are well-known instruction sequences that bring the behavior up to, say, the C++ spec or one or the other IEEE standard. Obviously, the proposed operations have vastly better performance on x86 than the strict ones, but code that doesn't rule out NaNs needs some mitigation (along the lines of what native libraries do), which still might be worth it from a performance point of view.

New operations
Just to summarize:
I think in general these have the same FP vs non-FP considerations as above, with a few extras (like single-rounding FMA). The fact that these are new may not be an advantage.