tc39 / ecmascript_simd

SIMD numeric type for EcmaScript
Other
544 stars 64 forks source link

shifts behavior with large counts #299

Closed sunfishcode closed 8 years ago

sunfishcode commented 9 years ago

SIMD.js' shift operations currently treat the shift count as a plain unsigned value. This is consistent with what SSE shift instructions do, and is intuitive behavior, however it differs from JS's scalar shift behavior, and also WebAssembly's expected shift behavior.

I propose we change SIMD.js' shift operations to mask the shift count, so eg. a shift of 32-bit values would use count&31, a shift of 16-bit values would use count&15, and so on. This will make them a little slower on SSE, but possibly faster on NEON, and will make them compatible with shifts in JS and other languages.

littledan commented 9 years ago

Sounds reasonable to me. What drove the decision to start with our current semantics?

PeterJensen commented 9 years ago

This was discussed in #72

@sunfishcode wrote:

"I don't have a strong argument here, other than observing that returning 0/-1 is conceptually cleaner for a shift operation, and that also avoids extra overhead on x86."

I think that most uses of shifts will use a constant shift count, in which case there's no additional overhead with and'ing of the shift count operand, so probably not a noticeable impact.

Dan, what's the reasoning behind the shift WebAssembly semantics (and'ing of the count)? As you pointed out the JS semantics is counter-intuitive, so why go with what's counter-intuitive? I've actually been bitten by this in times past.

johnmccutchan commented 9 years ago

Yes, I'm curious why we care about matching JS scalar semantics.

sunfishcode commented 9 years ago

I had pushed to give WebAssembly's scalar operations the more intuitive semantics too, but I've been in the minority on this issue, and the WebAssembly design documents have recently been changed to use masking semantics. With both JS and WebAssembly using masking semantics for scalar, the possibility of achieving a symmetry between SIMD and scalar that spans both languages now seems the most valuable.

johnmccutchan commented 9 years ago

sgtm

sunfishcode commented 9 years ago

Also, the reasoning behind the shift of WebAssembly's semantics is that it's what x86 and arm64 do on scalar, and in the worst case, a scalar and-with-immediate is cheap on all other architectures.

PeterJensen commented 9 years ago

@sunfishcode thanks for providing a little more background. I can see the value in having symmetry between JS and WebAssembly, esp. if asm.js is used as a means of executing webasm code in non-supporting browsers. So, sgtm as well

nmostafa commented 9 years ago

One concern is translation of shift intrinsics to asm.js/wasm with dynamic shift count on x86. Not sure how common are those in existing C/C++ workloads. I am not fond of having an IF condition (see below).

We have to weigh that penalty against the benefit for polyfilled SIMD shift on VMs without SIMD.js support, and vectorized code. For the former, I don't think the benefit is great, since the scalar version of the operation would be slow anyways. And the latter, we could have the compiler do the masking in the target language (asm/wasm) instead, and it shouldn't be expensive (a mere AND).

For intrinsics:

// v: input value, s: shift amount
if (s > 31)
{
   result = SIMD.Int32x4(0,0,0,0);
}
else
{
  result = SIMD.Int32x4.shift*(v, s);
}
sunfishcode commented 8 years ago

In this week's SIMD conference call, there was general agreement that we should adopt the masking semantics, for consistency with scalar JS among other reasons.

This will make emulating x86 intrinsics on top of SIMD.js harder, as noted above, though note that it's only for dynamic-count shifts, which are a relatively rare use case. And, this area is a good candidate to address through future API extensions which expose more architecture-specific features (possibly through a mechanism such as this).

PeterJensen commented 8 years ago

I create a PR to the spec for this here: #320

PeterJensen commented 8 years ago

The PR above was merged, so closing this one