WebAssembly / component-model

Repository for design and specification of the Component Model

Support big-endian hosts #302

Closed SoniEx2 closed 7 months ago

SoniEx2 commented 7 months ago

WebAssembly has so far pretty much refused to consider questions related to the interoperability between little-endian guest (wasm) code and a big-endian host. For example, the current approach to wasm-c-api expects values in the wasm memory to be seen as little-endian by the host, requiring an endian swap on every load and store as well as emulation of atomics using compare-and-swap/compare-and-exchange. In other words, wasm-c-api simply ignores the existence of host endianness. However, the component model is in a different position: from what we can understand of it, the component model is effectively trying to be an ABI schema, which is something much more malleable than the wasm-c-api.

We would appreciate it if the component model took into consideration wabt's slightly unusual (in the wider space of big-endian-supporting wasm engines) implementation approach. In particular, it'd be really cool if the component model enabled writing host-endian-agnostic host interface implementations, like WASI, thus moving any necessary endian and/or memory layout conversions into the wasm engine and out of the WASI implementation.

lukewagner commented 7 months ago

I think you're right that the component model opens up some new opportunities to move endianness support into the runtime (and the auto-generated host bindings, e.g., by wit-bindgen).

First, just to be very clear, we can't do anything about the endianness of the core wasm being executed (which is fixed by the core wasm standard to be little-endian) or the Canonical ABI as exposed to core wasm (which must match core wasm). However, in theory host bindings generation (which takes care of all the low-level details of reading these little-endian bytes out of core wasm and presenting it in the host impl language) could indeed automatically do the necessary endian conversions so that the manually-written host code could be written in an endian-agnostic manner.
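To illustrate the kind of conversion a bindings generator could emit, here is a minimal sketch, not taken from wit-bindgen or any real tool. The function names `lift_u32` and `lower_u32` are hypothetical, and linear memory is modeled as a plain byte slice. Because the helpers go through `from_le_bytes` and `to_le_bytes`, the hand-written host code that calls them never needs to know the host's endianness.

```rust
/// Read a u32 stored in little-endian order at `offset` in linear memory.
/// Returns None if the access would fall outside the memory.
/// (Hypothetical helper; the name is not from any real bindings generator.)
fn lift_u32(memory: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let mut raw = [0u8; 4];
    raw.copy_from_slice(memory.get(offset..end)?);
    // from_le_bytes byte-swaps on big-endian hosts and is a no-op on little-endian ones.
    Some(u32::from_le_bytes(raw))
}

/// Write a u32 in little-endian order at `offset` in linear memory.
/// Returns None if the access would fall outside the memory.
fn lower_u32(memory: &mut [u8], offset: usize, value: u32) -> Option<()> {
    let end = offset.checked_add(4)?;
    memory
        .get_mut(offset..end)?
        .copy_from_slice(&value.to_le_bytes());
    Some(())
}
```

Host code would then work only with ordinary `u32` values, with the byte order handled entirely inside these helpers.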

To support this, I'm not aware of anything we need to additionally do at the component-model spec level (just by clearly specifying the semantics of lifting and lowering, we've already given the host bindings generator all it needs to Do The Right Thing(tm)). Rather, I'd imagine you'd try out existing host-bindings generator tools on a big-endian platform and see if anything breaks and submit feedback on the appropriate tooling repos.

SoniEx2 commented 7 months ago

here's our attempt to explain it: https://chaos.social/@SoniEx2/111898426486802558

sunfishcode commented 7 months ago

The technique you describe there, and in https://github.com/WebAssembly/wasm-c-api/issues/180, can be done without any changes in the component-model spec. As @lukewagner says, core-wasm is little-endian, and the Canonical ABI must match that. The place to pursue component-model support for this reversed-linear-memory technique is in the execution engine and host-bindings generator you're using.
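For readers unfamiliar with the idea, the following is a rough sketch of what a reversed linear memory could look like. It is based on one reading of the technique and is not code from wabt or from the linked posts. The type `ReversedMemory` and its methods are hypothetical. The assumption is that guest byte `a` is stored at host index `len - 1 - a`, so a little-endian guest value appears in big-endian order in host storage. `from_be_bytes` stands in here for what would be a plain native load on a big-endian host.

```rust
/// Hypothetical reversed linear memory: guest address `a` maps to host index `len - 1 - a`.
struct ReversedMemory {
    bytes: Vec<u8>,
}

impl ReversedMemory {
    fn new(size: usize) -> Self {
        ReversedMemory { bytes: vec![0; size] }
    }

    /// Lowest host index covering the guest's `n` bytes at `addr`,
    /// or None if the access is out of bounds.
    fn host_start(&self, addr: usize, n: usize) -> Option<usize> {
        let end = addr.checked_add(n)?;
        self.bytes.len().checked_sub(end)
    }

    /// Store a single guest byte.
    fn store_u8(&mut self, addr: usize, value: u8) -> Option<()> {
        let i = self.host_start(addr, 1)?;
        self.bytes[i] = value;
        Some(())
    }

    /// Load the guest's little-endian u32 at `addr`.
    /// In host storage those four bytes sit in big-endian order,
    /// so no byte swap is needed on a big-endian host.
    fn load_u32(&self, addr: usize) -> Option<u32> {
        let start = self.host_start(addr, 4)?;
        let mut raw = [0u8; 4];
        raw.copy_from_slice(&self.bytes[start..start + 4]);
        Some(u32::from_be_bytes(raw))
    }
}
```

The trade-off in this sketch is that every address computation involves the memory size, and any host API that expects guest bytes in guest order (such as wasm-c-api's view of memory) no longer matches the storage.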

SoniEx2 commented 7 months ago

can we deprecate wasm-c-api altogether in favor of component model then, if component model is so much better for resolving host-guest endian mismatch issues?

sunfishcode commented 7 months ago

The component model isn't a replacement for wasm-c-api per se. In theory an API in the spirit of wasm-c-api that's organized around component-model concepts could be built, but I'm not aware of such a thing today.

rossberg commented 7 months ago

The C API is an interface for embedders of a Wasm engine. That's a completely different layer than where the component model operates. They have rather little to do with each other.

SoniEx2 commented 7 months ago

well we want

  1. an API between guest and host (whether embedder or just host modules)
  2. that acknowledges the specific needs and challenges of running wasm on big-endian

and we can't seem to get anyone to let us have one.

rossberg commented 7 months ago

> we can't seem to get anyone to let us have one.

The current C API is fine on BE when combined with available C/C++ libraries for dealing with endianness conversions. I've yet to see an alternative that actually is better and does not break basic assumptions.

SoniEx2 commented 7 months ago

the current C API doesn't let us beat wasmtime in highly contended lock-free workloads.

we can use regular fetch add while wasmtime is stuck with much slower compare and swap. but we can't support C API like this.
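The difference between the two approaches can be sketched as follows. This is an illustration only, not code from wasmtime or wabt, and both function names are hypothetical. `fetch_add_byte_swapped` models a host whose native byte order is the opposite of the stored value: it cannot use the hardware add directly, because carries would propagate across the wrong bytes, so it retries a compare-and-swap. `fetch_add_native` models the case where the stored value is already in host order.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Atomic add on a cell whose stored bits are byte-swapped relative to the host.
/// Returns the previous logical value. Needs a compare-and-swap retry loop.
fn fetch_add_byte_swapped(cell: &AtomicU32, delta: u32) -> u32 {
    let mut raw = cell.load(Ordering::Relaxed);
    loop {
        let old = raw.swap_bytes();
        let new_raw = old.wrapping_add(delta).swap_bytes();
        match cell.compare_exchange_weak(raw, new_raw, Ordering::SeqCst, Ordering::Relaxed) {
            Ok(_) => return old,
            // Another thread changed the cell (or the weak CAS failed spuriously); retry.
            Err(actual) => raw = actual,
        }
    }
}

/// Atomic add on a cell stored in host byte order: a single fetch-add.
fn fetch_add_native(cell: &AtomicU32, delta: u32) -> u32 {
    cell.fetch_add(delta, Ordering::SeqCst)
}
```

Under heavy contention the retry loop can spin repeatedly, whereas the single fetch-add does not retry in this code. Actual performance depends on the hardware and is not measured here.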

maybe ditching standards for performance is okay tho. not entirely sure.