WebAssembly / wasm-c-api

Wasm C API prototype
Apache License 2.0
534 stars 77 forks source link

Type constructors should have a `wasm_engine_t*` parameter #190

Open fitzgen opened 7 months ago

fitzgen commented 7 months ago

In general, runtimes will deduplicate and canonicalize function types so that a signature can have a single engine-wide ID that can be used when type checking indirect calls across modules (or user functions) that defined the function types separately.

With the typed function references and GC proposals, this canonicalization becomes mandatory, rather than just an optimization (unless you want exponential runtime type checks). And not just for function types but structs and arrays as well.

Therefore, all type constructors should be given an engine as the context within which this canonicalization can happen.

Without being given an engine context, runtimes will either be forced to have global state, which is far from ideal, or to lazily canonicalize as types are used which adds performance cliffs to various operations based on whether that lazy canonicalization has already happened or not.

rossberg commented 7 months ago

That is an interesting point. A few questions/comments though.

I think you meant a store parameter, not an engine parameter, as the store is the unit of isolation for all allocated state in the API. If you use an engine parameter, you'd make it global state as well, so that'd be kind of pointless.

On the other hand, the state here is merely a memoisation cache for immutable object construction, so somewhat less problematic than other forms of state. But I can see why you'd want to be able to dispose it with other allocated state.

As a nit: canonicalisation is an implementation technique. Technically, the required semantics can be implemented entirely without it, just not efficiently. FWIW, the alternative would be a linear tree comparison, not an exponential algorithm; e.g. the reference interpreter does just that. In principle, an engine could also memoise this comparison, in which case it could be close to amortised constant time (but indeed, individual operations would have unpredictable performance then).

syrusakbary commented 7 months ago

Note: as part of the process it may be useful as well to change the signature of the wasm_module_new from wasm_store_t to wasm_engine_t

rossberg commented 7 months ago

@syrusakbary, I don't see how that would make sense, care to elaborate?

fitzgen commented 7 months ago

I think you meant a store parameter, not an engine parameter, as the store is the unit of isolation for all allocated state in the API. If you use an engine parameter, you'd make it global state as well, so that'd be kind of pointless.

At least in Wasmtime, it is expected that stores are created and torn down frequently (eg for every request in a serverless application) and so we do type canonicalization at the engine level rather than store level to amortize its cost. (Indeed wasmtime::Module::new takes an engine parameter, not a store parameter.) Also, there can be multiple Engines created per process with different configurations, so engines aren't global.

That said, taking a store in the Wasm C API wouldn't be the end of the world, and would be an improvement over taking zero context, because a store holds a reference to its engine, so we could internally implement the C API in terms of our Rust API without global state.

rossberg commented 7 months ago

Right, taking a store gives engines the choice to implement it per store or engine-global.

For example, IIRC, Module::new would be unimplementable in V8 if it did not take a store parameter.

(Also, I don't really understand what an engine parameter achieves over no parameter, at least for types.)

fitzgen commented 7 months ago

(Also, I don't really understand what an engine parameter achieves over no parameter, at least for types.)

Happy to add a store parameter since that gives the most implementation flexibility, as previously discussed, but to help clarify your misunderstanding:

Engines are not global for Wasmtime. You can have multiple engines in a process. We canonicalize types within the context of an Engine. Therefore, taking an engine parameter would allow us to avoid some global variable somewhere that all types are canonicalized into and must be synchronized upon regardless whether the type being created will ever be used in that particular engine.

We can get these same benefits via a store parameter (modulo users not realizing that the types they create with a store parameter are valid for the whole engine lifetime, rather than just the store's lifetime, and re-creating/re-canonicalizing their types for every store they create for every HTTP request their serverless application receives, for example) so the point is a bit moot, but I hope that clarifies things for you.

rossberg commented 7 months ago

I see, that makes sense, thanks for the clarification.

Adding a store parameter seems like the right way forward then. Except that I'm not sure what a good migration plan would be, however, since this would obviously break existing clients.

Btw, canonicalisation was always necessary for function types, even before the GC proposal. How was this handled so far?

fitzgen commented 7 months ago

Btw, canonicalisation was always necessary for function types, even before the GC proposal. How was this handled so far?

We would re-canonicalize/re-register the type on use, which is not ideal but also pretty simple in MVP Wasm.

fitzgen commented 7 months ago

Except that I'm not sure what a good migration plan would be, however, since this would obviously break existing clients.

Does this repo have an associated semver? We could increment it such that it was a breaking change.

If not, it should probably gain one.

rossberg commented 7 months ago

There is no versioning scheme so far. Initially I was hoping we can avoid breaking changes, following the general Wasm strategy.

Also, this API could generally use an update to support the new Wasm 3 features like richer reference types and GC properly. To be honest, I haven't had time to work on it at all over the past few years, and probably won't any time soon. If others feel like contributing, though, I'm happy to review PRs. :)