Does wasm have or need a cpuid opcode?

NWilson commented 6 years ago

I can't seem to find a "cpuid" instruction in Wasm, yet it would seem to be indispensable for more opcodes to be added to Wasm in future.

When any new opcodes are added (eg atomics, or wait/wake, or memcpy, ...) people will want to write code like this:

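void *memcpy(void *dst, const void *src, size_t len) {
  if (__builtin_wasm_cpuid(0 /* i32 page number */) & WASM_OP_ID_MEM) {
    /* The embedder reports support for the memcpy opcode: use it. */
    __builtin_wasm_memcpy(dst, src, len);
  } else {
    /* Fallback: plain byte-by-byte copy. */
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < len; i++) d[i] = s[i];
  }
  return dst;
}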

Given the potential explosion of new opcodes being added to Wasm in future, the ability for a Wasm module to detect browser support for the opcodes clearly has value.

Is this capability available, or am I missing something?

Proposal

A Wasm opcode, which takes an i32 argument, and returns an i32. If the embedder does not recognise the argument value (because it will only be defined in the future), the return value is zero. If the embedder does understand the argument, the return value has the bits set which correspond to implemented Wasm features. The argument value is a "page" of cpuid bits, allowing for feature detection of more than 32 different features or opcodes. It is expected that opcode support bits will be added first to page zero, then page one when that is filled, and so on.

Each new opcode (or semantic alteration on top of existing opcodes) can be detected at runtime via a Wasm cpuid bit.

Example:

"Bit zero of page zero denotes support for the Wasm memcpy/memset instructions" (WASM_OP_ID_MEM). If the embedder does not support either or both of those opcodes, it shall return zero for bit zero of page zero.
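
As a rough illustration of the proposed semantics, here is a minimal sketch in C of how an embedder might answer the query and how a module might test a bit. Only WASM_OP_ID_MEM comes from the proposal above; the names wasm_cpuid, wasm_has_feature, WASM_OP_ID_ATOMICS and KNOWN_PAGES are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Feature bits for page zero. Only WASM_OP_ID_MEM is named in the proposal. */
#define WASM_OP_ID_MEM     (UINT32_C(1) << 0)  /* memcpy/memset instructions */
#define WASM_OP_ID_ATOMICS (UINT32_C(1) << 1)  /* assumed bit, illustration only */

/* Number of pages this imaginary embedder knows about. */
#define KNOWN_PAGES 1u

/* Embedder side: return the feature bits for a page, or zero for an unknown page. */
uint32_t wasm_cpuid(uint32_t page) {
    static const uint32_t pages[KNOWN_PAGES] = {
        WASM_OP_ID_MEM /* this embedder implements memcpy/memset but not atomics */
    };
    return page < KNOWN_PAGES ? pages[page] : 0;
}

/* Module side: test a single feature bit on a given page. */
int wasm_has_feature(uint32_t page, uint32_t bit) {
    return (wasm_cpuid(page) & bit) != 0;
}
```

Because an unknown page returns zero, a module built against a newer list of feature bits would simply see every bit on that page as unsupported when run on an older embedder.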

jfbastien commented 6 years ago

We've settled on feature detection as the way to do this for now, though I don't think there's a specific issue outlining it. There's a bunch of discussion on the topic.

So you'd create a tiny .wasm using the new opcode, and pass it to WebAssembly.validate to see if the feature is supported. The specific code pattern you point out has also been discussed, and I believe we want to discourage it.

NWilson commented 6 years ago

That's very surprising! So if I have a big application (eg 1MB of Wasm), you're expecting me to compile it multiple times (once with the "memcpy" opcode, once without)... and if there are say three/four proposals adding new opcodes, I'd have to compile it 16 times, and then at runtime the JS code would load the version of the application that has the most supported features?
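
The arithmetic behind that worry can be written down as a small sketch in C. It assumes every combination of n independent, optional features gets its own build; the function name is made up for this example.

```c
#include <stdint.h>

/* Builds needed if each combination of n independent optional
 * features gets its own binary: 2^n. Only valid for n < 64. */
uint64_t builds_for_independent_features(unsigned n) {
    return UINT64_C(1) << n;
}
```

With three such features this gives 8 builds, and with four it gives the 16 mentioned above.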

Given there are half a dozen proposals open already, this could get out of hand quickly. It puts a lot of burden on the end developer - maybe the developer doesn't know or care whether there's an optimised memcpy opcode available, he's just confused why he has to compile/link two versions of his application.

On desktop platforms, it's overwhelmingly common to detect support for individual SSE instructions at runtime (using CPUID to check for SSE4) - and still distribute a single application or library to the consumer.

The only mitigation I can think of is if you're thinking of making all new opcodes mandatory to implement? For example, you could have Wasm MVP, Wasm v2, Wasm v3, each with more opcodes than the previous one, and an embedder has to implement all features of Wasm v2 before moving on to opcodes in Wasm v3. That would mean that by the time you've reached Wasm v4, you only need to distribute 4 versions of your Wasm app, instead of 16 if you had to build once for every combination of features.

I can't see that being popular either though - surely you don't want the various Wasm proposals to stack linearly like that, since it interferes with shipping orthogonal features independently. You don't want to force all browser vendors to implement all the proposals in exactly the same order.

NWilson commented 6 years ago

Previous decision was recorded here: https://github.com/WebAssembly/design/issues/416#issuecomment-188505172

Apparently, yes, devs are expected to compile many (dozens of?) versions of their app and select between them at runtime. Gross and unscalable.

My expectation was that Wasm embedders will trap on executing an unknown opcode, but will happily load and run the Wasm module (as long as the unknown opcode is not reached). This should be very easy to support in AOT compilation, simply replacing unknown instructions with trap.

jfbastien commented 6 years ago

Most new features are pretty huge, such as SIMD and threads, and can't really be picked at runtime based on fine-grained feature detection. The smaller features tag along for the ride.

The intent isn't to number WebAssembly versions. One implementation may do threads before SIMD, and another the reverse. The WG will publish incremental versions, but new features aren't mandated as dependencies on each other.

You don't need to use all the features in a perfect harmony! Your toolchain can be told to emit 2 or 3 variants if that really matters, enough to capture say 99% of deployed browsers. Eventually caniuse.com will show big enough support for certain features that you can just lump them together.

> My expectation was that Wasm embedders will trap on executing an unknown opcode

No, embedders fail validation on unknown opcodes. The way opcodes are encoded, we don't know how many immediates an unknown opcode has, so it simply cannot be skipped, though you could skip the entire function and trap on entry.
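
To see why an unknown opcode cannot be stepped over, consider a minimal sketch in C of a decoder that skips one instruction at a time. It handles only four opcodes from the core instruction set (nop, local.get, i32.const, i32.add), and the function names are hypothetical. It is not how any particular engine is written.

```c
#include <stddef.h>
#include <stdint.h>

/* Skip one LEB128-encoded integer starting at pos.
 * Returns the position after it, or -1 if the input is truncated. */
long skip_leb128(const uint8_t *code, size_t len, size_t pos) {
    while (pos < len) {
        if ((code[pos++] & 0x80) == 0) {
            return (long)pos;  /* high bit clear: last byte of the integer */
        }
    }
    return -1;
}

/* Returns the position of the next instruction,
 * or -1 if the opcode is unknown or the input is truncated. */
long skip_instruction(const uint8_t *code, size_t len, size_t pos) {
    if (pos >= len) return -1;
    switch (code[pos]) {
    case 0x01: /* nop: no immediates */
    case 0x6a: /* i32.add: no immediates */
        return (long)(pos + 1);
    case 0x20: /* local.get: one LEB128 local index */
    case 0x41: /* i32.const: one LEB128 value */
        return skip_leb128(code, len, pos + 1);
    default:
        /* Unknown opcode: the number of immediate bytes is unknowable,
         * so there is no way to find where the next instruction starts. */
        return -1;
    }
}
```

The length of each instruction comes from per-opcode knowledge, not from the encoding itself, which is why decoding has to stop at the first opcode it does not recognise.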

rossberg commented 6 years ago

The idea for feature detection is that you construct a tiny probe module with dummy uses of the features you want to test for and validate that. Features are more than opcodes. The approach you suggest doesn't work for anything that is not just a simple instruction (e.g., because it involves new types).

Wasm is a low-level language, so feature growth is expected to be relatively slow. There is no reason to assume much divergence among active implementations, or a highly non-linear evolution. So in practice it's rather unlikely that there'll ever be a need for more than, say, two versions, especially if you can factor out uses of a "new" feature into a dedicated module that you can just swap.

NWilson commented 6 years ago

OK, if that's what's desired... under this scheme I basically can't see many developers (including myself) gaining any benefit from new Wasm instructions, until 100% of the web has upgraded their browser to a version that supports it. Conditionally-compiling our app multiple times for each browser variant would be just too much hassle.

I'd have thought that SIMD would be an ideal candidate for feature detection. After all, on desktop platforms, compilers like GCC routinely do auto-vectorisation, and include a CPUID check to use SSE2/SSE3/SSE4 depending on what's available. Oh well.

Closing.

WebAssembly / design

Does wasm have or need a cpuid opcode? #1161
