tc39 / ecmascript_simd

SIMD numeric type for EcmaScript

SIMD.js on non-SSE devices #317

Status: Closed (nmostafa closed this 8 years ago)

nmostafa commented 8 years ago

We had agreed to keep SIMD functionality always enabled, regardless of whether the implementation is optimized or not. This also seems to be the opinion of TC39.

This poses a problem on non-SSE devices. In Chakra, the unoptimized implementation (runtime library) uses SSE2 intrinsics. This was a design choice to guarantee identical semantics to JIT'ed code (no change in precision, rounding, etc., when transitioning from interpreter to JIT and vice versa). If SIMD is enabled without SSE2, the runtime library will not function, hence the problem.

One solution is to have a non-SSE sequential implementation of each operation as fallback code if SSE2 is not available, which will be OK since JITing of SIMD ops will be disabled on such platforms. This is obviously a large amount of work to develop, maintain, and test as part of the runtime. Another solution is to use the JS polyfill on those platforms, but I am not sure if that would be spec-compliant.

Or should we reconsider making SIMD.js optional (based on platform features or the availability of an optimized implementation)? Thoughts?
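To make the first option concrete, here is a minimal sketch of what a sequential fallback for a single operation could look like, written in JavaScript rather than in a runtime's C++. The names `float32x4` and `float32x4Add` are hypothetical and are not taken from Chakra, V8, or the polyfill. The sketch relies on `Math.fround` to round each lane to single precision, on the assumption that rounding the double-precision sum of two float32 values matches a native single-precision add (denormal handling aside).

```javascript
// Hypothetical helper: pack four lanes into a Float32Array so every lane
// is stored with single-precision rounding.
function float32x4(a, b, c, d) {
  return Float32Array.of(a, b, c, d);
}

// Hypothetical sequential fallback for a four-lane float32 add.
function float32x4Add(x, y) {
  const out = new Float32Array(4);
  for (let lane = 0; lane < 4; lane++) {
    // The sum is computed in double precision, then rounded to float32.
    out[lane] = Math.fround(x[lane] + y[lane]);
  }
  return out;
}

const lanes = float32x4Add(float32x4(1, 2, 3, 4), float32x4(10, 20, 30, 40));
console.log(Array.from(lanes));
```

Every SIMD.js operation would need an equivalent of this loop, which is the development and testing cost raised above.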

johnmccutchan commented 8 years ago

> This poses a problem on non-SSE devices. In Chakra, the unoptimized implementation (runtime library) uses SSE2 intrinsics. This was a design choice to guarantee identical semantics to JIT'ed code (no change in precision, rounding, etc., when transitioning from interpreter to JIT and vice versa). If SIMD is enabled without SSE2, the runtime library will not function, hence the problem.

SSE2 has been in x86 chips for ~15 years starting with the Pentium 4. Windows 8 and above require a CPU that has support for NX Bit. That was included in the Pentium 4. So, how can someone run Chakra on a chip without SSE2?

> Or should we reconsider making SIMD.js optional (based on platform features or the availability of an optimized implementation)? Thoughts?

No, we should not reconsider making SIMD.js optional.

littledan commented 8 years ago

Two points:

nmostafa commented 8 years ago

@johnmccutchan

> SSE2 has been in x86 chips for ~15 years starting with the Pentium 4. Windows 8 and above require a CPU that has support for NX Bit. That was included in the Pentium 4. So, how can someone run Chakra on a chip without SSE2?

True, but I am talking about low-end IoT devices (e.g. Intel Quark). SIMD.js for such devices would require a C++ or x87-based code-gen implementation. And it seems pointless to me to allow the feature where there is no SIMD ISA to start with, and no performance benefit.

@littledan, good points.

> The SIMD.js spec is already well-defined without executing on actual SIMD hardware, and V8 has successfully been going about its implementation by starting with a C++ implementation rather than SSE for its baseline cross-platform implementation.

Do you know if this implementation is used along with the optimized one? Do you bail out from optimized SSE code to the C++ implementation?

> TC39 can't stop companies from shipping non-spec-compliant JavaScript implementations without full functionality.

Good point. But it seems that being spec-compliant is a goal of IoT JS engines (see JerryScript), and as you mentioned, we would need at least a C++ implementation for these platforms.

jfbastien commented 8 years ago

I don't understand the proposal: what would JS code using SIMD.js do if an implementation were to not implement the SIMD.js option? Simply not work?

How is that different from current SIMD.js (which isn't optional), where an engine decides to diverge by not implementing SIMD.js? From a user's perspective it still doesn't work.

Users lose in both cases! I don't get the advantage of saying that something is optional. What am I missing?

nmostafa commented 8 years ago

> I don't understand the proposal: what would JS code using SIMD.js do if an implementation were to not implement the SIMD.js option? Simply not work?

I imagine the JS code would have a fallback sequential version of the vectorized kernel. JS code might look like this:

if (typeof SIMD !== "undefined") {
  /* vectorized version */
} else {
  /* sequential version */
}

One would expect the vectorized path to always be faster. However, if SIMD is always defined, regardless of performance, we may end up with the vectorized path being slower. That's because a generic baseline implementation requires a call to the runtime for each SIMD op, while the sequential code can be type-specialized. So by making SIMD optional, we reflect implementation status.

Another advantage is not investing in a generic C++ implementation that users don't really care about.
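The feature-detection pattern described above could be fleshed out roughly as follows. This is only a sketch: `sumSequential`, `sumVectorized`, and `sumFloats` are hypothetical names, and the vectorized branch assumes the `SIMD.Float32x4` functions (`splat`, `load`, `add`, `extractLane`) from the SIMD.js draft. On an engine that does not define `SIMD`, only the sequential branch ever runs.

```javascript
// Sequential kernel: always available, and easy for a JIT to type-specialize.
function sumSequential(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) total += values[i];
  return total;
}

// Vectorized kernel: assumes the SIMD.js draft API is present.
function sumVectorized(values) {
  const f4 = SIMD.Float32x4;
  const limit = values.length - (values.length % 4);
  let acc = f4.splat(0);
  // Accumulate four lanes per iteration.
  for (let i = 0; i < limit; i += 4) acc = f4.add(acc, f4.load(values, i));
  let total = f4.extractLane(acc, 0) + f4.extractLane(acc, 1) +
              f4.extractLane(acc, 2) + f4.extractLane(acc, 3);
  // Handle the elements that do not fill a whole vector.
  for (let i = limit; i < values.length; i++) total += values[i];
  return total;
}

// typeof avoids a ReferenceError when the SIMD global does not exist.
const sumFloats = typeof SIMD !== "undefined" ? sumVectorized : sumSequential;

console.log(sumFloats(new Float32Array([1, 2, 3, 4, 5])));
```

The check only tells the program that `SIMD` exists, not that it is fast, which is the concern about an always-defined but unoptimized implementation.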

johnmccutchan commented 8 years ago

@nmostafa TC39 has already decided that SIMD.js is not going to be made optional. This won't be relitigated. I suggest that the Chakra team follow the V8 team in developing a generic C++ (or JavaScript) implementation as the fallback.

littledan commented 8 years ago

For more general background, TC39 has repeatedly rejected the idea of making a more embedded, IoT-friendly profile, though it has been proposed several times. I gave Apple space to bring up this topic again at the January meeting, and it was again rejected.

nmostafa commented 8 years ago

Thanks for the clarification, @littledan.