WebAssembly / WASI

WebAssembly System Interface

Rebase the Phase Process description on the CG's current process #549

Closed sunfishcode closed 10 months ago

sunfishcode commented 11 months ago

The CG Phase Process document has recently split out the entry requirements for each stage from the activities that happen within each stage, fixing an ambiguity about what happens before a stage and what happens within a stage. It also contains a number of generally useful updates.

This PR updates the WASI Phase Process using wording derived from the CG Phase Process, adapting it to meet WASI's needs. The resulting process is roughly the same as the existing process, however I've made it more specific in a few areas:

abrown commented 11 months ago

From the PR, for phase 2:

A wit description of the API exists.

Several recent conversations have made me pause and consider what this change might do to the ecosystem. Not all of the implications of this change are clear to everyone and I think we should make them clear here to avoid discontent in the future. To my eyes, the WebAssembly ecosystem is already quite fractured (have you tried building a module that runs on a standalone engine AND the web?) and one of the bright spots was the agreement from several different corners of the web on the WASI standard. I've heard the concern that binding WASI to WIT — and implicitly to the component model — might cause fractures (e.g., other standards or non-standard APIs).

Let me propose some questions so that someone can clarify where all this is headed. I'll do it from various perspectives:

  1. For the engine implementers: do I have to implement the component model to continue supporting WASI?

  2. For WASI API designers: what will I do if I can't express my API in WIT?

  3. For WASI API users: will code that I compiled to use WITX-defined WASI APIs (not just preview1) work in component model engines? What about the reverse — code using WIT-defined APIs in non-component model engines?

  4. For performance optimization: will every call across the WIT boundary incur data copy overhead?

I think I know the answer to some of these questions and some of them have been discussed in the past (cc: @sunfishcode, @pchickey) but I think making all of this explicit will be helpful.

xwang98 commented 11 months ago

We wish the component model were not mandatory for WASI. The component model introduces complexity and additional resource requirements, and it is not always wanted. For example, footprint is a hard requirement for embedded and IoT use cases, and we wish we could still use WASI in these domains.

yamt commented 11 months ago

my understanding is that having a WIT-defined interface doesn't imply a requirement for component-model support. you can use the corresponding core-wasm level abi directly if you want.

my impression is that WIT-defined interfaces are often less efficient if you compare them with an abi based on bare linear memory pointers though.

woodsmc commented 11 months ago

I agree with @abrown here: answers to these questions would be helpful for many in the ecosystem. To @abrown's observation, the answers to these questions are often known inside the team working on the Component Model. It's just a change in perspective: approaching the Component Model from the outside in.

I know many folks may be worried about stating known limitations. But I think it's an opportunity for community engagement. For instance, I'm aware that there is scope for performance improvements in how the marshalling between components works. For those enthusiastic about the component model, it points out an area where contributions and community focus may be welcome.

On the general topic of performance and overhead: to @xwang98's point, we have similar concerns, but feel that getting some hard data on the performance and impact would be great. It would allow everyone to assess its suitability for their particular use cases. We'd love to see some performance metrics / data. But being realistic, I also know this isn't going to be possible until after Preview 2 is released. Again, stating this as an area of contribution or focus following the release of Preview 2 would be great.

Regarding the engine impact, would it be possible to get some engineering guidance from those that have implemented the component model in an engine already? I'm guessing this may be the Wasmtime team. This would help address the concerns of other runtimes and provide guidance on the suitability of the component model for various domains. Maybe a future blog post or interview? Just a thought.

tschneidereit commented 11 months ago

I think I know the answer to some of these questions and some of them have been discussed in the past (cc: @sunfishcode, @pchickey) but I think making all of this explicit will be helpful.

Thank you for raising these questions this explicitly. I agree that it makes sense to explicitly work through them for other interested parties, so I'll walk through them in way more detail than you personally would need.

(Note: this got very long, and I apologize. I'd highly recommend reading the first section, and then those Q&A entries you're interested in.)

One thing I want to emphasize is that

nothing has changed about any of this in a long time!

The particular phase 2 requirement you mention ("A wit description of the API exists") has been in place since 2021, and explicitly and fairly prominently mentioned in the main README since early February 2022.

The more fundamental approach of basing WASI on top of another standard to define the ABI is even older: the very first WASI overview document from when WASI was announced in April 2019 mentions WASI gaining support for "Host Bindings". Since then, the Host Bindings proposal merged with the Module Linking proposal into what's now the Component Model.

That very first WASI overview also already includes the reason for moving towards defining WASI in terms of Host Bindings: the ABI approach used by WASI Preview 1 works very well for languages like C, C++, and Rust, which use linear memory. It doesn't work well at all for languages like Kotlin and Dart, which use Wasm GC. That's because it fundamentally assumes that there is a linear memory heap to read values from and write them into.

With the Component Model's approach of defining the ABI in terms of canonical lowering and lifting operations for each data type, we can directly and efficiently support languages using GC by defining lifting and lowering operations that operate on GC objects.

Based on this, another thing I want to highlight is that

not using an approach like the Component Model's WIT-based APIs means no real support for Kotlin, Dart, and other languages using Wasm GC

In summary, this PR clarifies some aspects of the process, but doesn't in any way represent a change in direction.

@xwang98 the above means that the direction WASI is on hasn't changed, and that being based on what is now the Component Model has been part of the design since the very beginning. This approach has since been confirmed a number of times, and changing it now would not only mean discarding many person-years of work, but also require coming up with a different approach to at least some of the goals. E.g. I'm sure you agree that not supporting languages using Wasm GC isn't really an option.

With that all out of the way, I'll try to answer @abrown's questions below, as well as some of the concerns @woodsmc raised.


Q&A

  1. For the engine implementers: do I have to implement the component model to continue supporting WASI?

As @yamt says: no, that's not required to support content that'd otherwise use a WASI Preview1-style ABI using WITX. WebAssembly Components define a new binary format, but to support content that works in roughly the same way as Wasm core modules targeting Preview1, all that's needed is the ability to "unwrap" the core modules contained in these components and then communicate with them via the canonical ABI. This ABI is roughly equivalent to the ABI witx defines and which is used in WASI Preview1. This approach is e.g. taken by the JCO toolchain to support running components in JS engines such as browsers or Node.js which (for now) lack native support.

Another strong proof that this works is that we have multiple toolchains able to produce Components, despite none of them actually emitting the new binary format. Instead, they all emit core Wasm modules with the Component Model's ABI, which are then turned into Component binaries using external tooling.
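To illustrate what "communicating via the ABI" means at the core-wasm level, here is a minimal Python sketch of the (pointer, length) convention that both a witx-style ABI and the canonical ABI use for strings held in linear memory. `LinearMemory`, `bump_alloc`, `lower_string`, and `lift_string` are hypothetical names made up for this sketch; a real guest would export its own allocator and a real engine would read wasm memory directly.

```python
# Toy model: a string crosses the boundary as (pointer, byte length) into
# linear memory. All names here are hypothetical, not a real engine API.

class LinearMemory:
    def __init__(self, size):
        self.data = bytearray(size)
        self.next_free = 0  # state of a trivial bump allocator

    def bump_alloc(self, nbytes):
        """Reserve nbytes and return the offset (the 'pointer')."""
        if self.next_free + nbytes > len(self.data):
            raise MemoryError("linear memory exhausted")
        ptr = self.next_free
        self.next_free += nbytes
        return ptr


def lower_string(mem, text):
    """Encode text as UTF-8 into linear memory; return (ptr, byte_len)."""
    encoded = text.encode("utf-8")
    ptr = mem.bump_alloc(len(encoded))
    mem.data[ptr:ptr + len(encoded)] = encoded
    return ptr, len(encoded)


def lift_string(mem, ptr, length):
    """Read length bytes starting at ptr and decode them back to a string."""
    return bytes(mem.data[ptr:ptr + length]).decode("utf-8")


memory = LinearMemory(64)
ptr, length = lower_string(memory, "hello")
round_tripped = lift_string(memory, ptr, length)
```

An engine that only handles single modules needs roughly this much machinery per value type, which is why the approach does not depend on parsing linked components.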

  2. For WASI API designers: what will I do if I can't express my API in WIT?

WASI has as part of its fundamental design goals extremely high security standards, the ability to treat all languages as first-class citizens—instead of just ones that behave like C—and the ability to make all APIs fully virtualizable. (I.e., to enable all APIs to be implemented in WebAssembly and with the same privileges of all other content.)

Based on these goals, WASI (by virtue of being based on the Component Model) introduces two major constraints that can impose limitations on some API designs:

  1. full encapsulation of a Component's internals, with the WIT-defined API it exposes being the only way to interact with it
  2. a limitation to host calls that can be implemented in content instead of just the host

I don't think that any of this means that there is any kind of functionality that WIT fundamentally can't expose an API for. It's true however that some API designs won't work, and will need other approaches.

I think the most important reason for that is the need for all APIs to be language-agnostic though. E.g. shared-everything multi-threading doesn't really mean the same thing for languages using linear memory and those using Wasm GC. Features of that nature are I think best handled as Component-internal, much like Wasm GC itself, exception handling, or stack switching.

  3. For WASI API users: will code that I compiled to use WITX-defined WASI APIs (not just preview1) work in component model engines? What about the reverse: code using WIT-defined APIs in non-component model engines?

Content will run in those runtimes that support the ABI and binary format it uses, and that implement the APIs it requires. There's absolutely nothing stopping runtimes from supporting all kinds of different ABIs and binary formats, so the answer to the first question is definitely "yes".

Additionally, it's possible to fully support WASI Preview1-targeting content inside the component model. The Wasmtime project has been working on an adapter for just that, which should enable all Preview2-supporting runtimes to support Preview1 without maintaining multiple WASI implementations in parallel.

Since WASI Preview2 introduces a whole host of additional functionality, such as the wasi-http API, it's not really possible to support Preview2 in runtimes that don't support these interfaces. But as mentioned above, runtimes can choose to only implement support for the canonical ABI and running single modules, instead of supporting linked Components as well.

  4. For performance optimization: will every call across the WIT boundary incur data copy overhead?

Not in any way that's not there for WITX as well, and in fact inherent to WebAssembly in general. The fact that content runs inside a tight sandbox means that one can't just expose arbitrary regions of memory and operate on it without copying.

The Component Model does however introduce a concept that significantly reduces this overhead: resources and resource handles. These can represent a large collection of values without having to copy them, and enable operating on them via associated functions/methods.
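As a rough illustration of the handle idea, the Python sketch below keeps a large buffer on the "host" side and gives the "guest" only a small integer handle, which it passes back to accessor functions. `HandleTable`, `blob_len`, and `blob_byte_at` are hypothetical names for this sketch and do not correspond to any real WASI interface.

```python
# Toy model of resource handles: the guest holds an integer, the host holds
# the data. Names are hypothetical and for illustration only.

class HandleTable:
    """Maps small integer handles to host-owned objects."""

    def __init__(self):
        self._entries = {}
        self._next_handle = 1

    def insert(self, obj):
        handle = self._next_handle
        self._next_handle += 1
        self._entries[handle] = obj
        return handle

    def get(self, handle):
        # Raises KeyError for unknown or already-dropped handles.
        return self._entries[handle]

    def drop(self, handle):
        del self._entries[handle]


table = HandleTable()
big_blob = bytes(1_000_000)           # host-side data, never copied to the guest
blob_handle = table.insert(big_blob)  # the only thing that crosses the boundary


def blob_len(handle):
    """Accessor the guest calls with its handle."""
    return len(table.get(handle))


def blob_byte_at(handle, index):
    """Fetch a single byte without transferring the whole buffer."""
    return table.get(handle)[index]


size_seen_by_guest = blob_len(blob_handle)
first_byte = blob_byte_at(blob_handle, 0)
```

Only the integer handle and the small results cross the boundary; the million-byte buffer stays where it is.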


@woodsmc, I hope I was able to address some of your concerns with the above, but I'm very happy to discuss things in more detail (though maybe not as part of a PR that's not actually about any of this 😉)

woodsmc commented 11 months ago

Thanks @tschneidereit .

Thank you for raising these questions this explicitly. I agree that it makes sense to explicitly work through them for other interested parties, so I'll walk through them in way more detail than you personally would need.

Awesome; I know this is a perspective shift - rather than addressing our own internal community, we are answering the anticipated questions of others. I love this. Mainly because I get asked these types of questions from my organization frequently. Having them explicitly addressed lowers the barrier for entry and aids with technology adoption.

It would be amazing to have these points addressed explicitly in some "published" form, an addition to this document, or another. So thanks again!

RE: Performance / other issues - delighted to take it out of the PR. I'll reach out. Thank you.

yamt commented 10 months ago

This ABI is roughly equivalent to the ABI witx defines and which is used in WASI Preview1.

actually, it's sometimes considerably more expensive as it involves malloc.

yamt commented 10 months ago

RE: Performance / other issues - delighted to take it out of the PR. I'll reach out. Thank you.

if you move the discussion elsewhere, please give me a pointer to the new place. thank you.

sunfishcode commented 10 months ago

@yamt The places where the ABI does a malloc fall into two categories: there are some malloc calls that we haven't optimized yet but will, and there are some malloc calls in areas that have no witx equivalent.

yamt commented 10 months ago

@yamt The places where the ABI does a malloc fall into two categories: there are some malloc calls that we haven't optimized yet but will, and there are some malloc calls in areas that have no witx equivalent.

which category does e.g. fd_read (besides malloc, it lacks iov) fall into?

yamt commented 10 months ago

@yamt The places where the ABI does a malloc fall into two categories: there are some malloc calls that we haven't optimized yet but will, and there are some malloc calls in areas that have no witx equivalent.

when you say "haven't optimized yet but will", do you mean adapter functions?

sunfishcode commented 10 months ago

which category does e.g. fd_read (besides malloc, it lacks iov) fall into?

We haven't optimized it yet.

when you say "haven't optimized yet but will", do you mean adapter functions?

The malloc call can be optimized away by changing how the bindings are generated. iov functionality would require adding the feature to the canonical ABI spec, but it's doable.
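The difference under discussion can be sketched in Python as two read styles: one that scatters data into buffers the caller already owns (the iov style of Preview 1's fd_read), and one that returns a freshly allocated byte list on every call. `read_into` and `read_returning_list` are hypothetical helpers written for this sketch; they model the allocation behaviour only and are not the actual bindings.

```python
# Toy comparison of a caller-buffer (iov-style) read and a read that returns
# a new list each time. Function names are hypothetical.

def read_into(source, offset, buffers):
    """Scatter bytes from source into caller-owned buffers; no new allocation.

    Returns the total number of bytes written across all buffers.
    """
    total = 0
    for buf in buffers:
        chunk = source[offset + total: offset + total + len(buf)]
        buf[:len(chunk)] = chunk
        total += len(chunk)
        if len(chunk) < len(buf):
            break  # source exhausted
    return total


def read_returning_list(source, offset, max_len):
    """Return up to max_len bytes as a newly allocated object on every call."""
    return bytes(source[offset: offset + max_len])


source = b"hello, world"

# iov style: two reusable buffers supplied by the caller.
header, body = bytearray(5), bytearray(7)
written = read_into(source, 0, [header, body])

# list-returning style: the result buffer is allocated per call.
fresh = read_returning_list(source, 0, 12)
```

Both calls deliver the same bytes; the second one pays for an allocation each time, which is the cost the bindings change mentioned above would aim to remove.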

yamt commented 10 months ago

which category does e.g. fd_read (besides malloc, it lacks iov) fall into?

We haven't optimized it yet.

when you say "haven't optimized yet but will", do you mean adapter functions?

The malloc call can be optimized away by changing how the bindings are generated. iov functionality would require adding the feature to the canonical ABI spec, but it's doable.

ok.

do you have any idea when/if such optimizations can be made?

at core wasm level, what do such optimized versions look like? will they have different import names from the current version? i guess it's difficult to distinguish optimized and current versions by core func type alone.

penzn commented 10 months ago

Recap of some of the discussions that I think would be useful to share:

xwang98 commented 10 months ago

related discussion for reference: https://bytecodealliance.zulipchat.com/#narrow/stream/290350-wamr/topic/WAMR.20Open.20TSC.20meeting.20-.202023-08-29.20.209.3A00.20AM.28UTC.29

woodsmc commented 10 months ago

In the interests of visibility I wanted to share and get some feedback on two key items that should be considered in relation to this PR: performance equivalence and the parallel life of WIT and WITX. Both of these help to address the concerns expressed in this thread and related discussions.

Performance Equivalence
Could we change the PR, to include a stipulation that the WASI interfaces which currently exist shouldn't see a performance degradation?... This was discussed with @lukewagner @ricochet. The performance criteria would allow for variation and innovation of the implementation, and indeed the standard going forward while ensuring no negative impact for existing solutions.

Justification / Mitigation of Concern
The WIT format is essentially a moving target at the moment. It is continuing to evolve. New concepts are already planned for introduction. An async concept has been proposed. This is an example of a primitive which is exclusive to a set of more modern languages, like Rust. async doesn't exist in C.

A concern would be these newer language specific primitives being used to define interfaces upon which existing C code is heavily dependent. A socket interface is a great example of this. If a socket interface were to be implemented using async then it would work great for Rust, but would suck for C, and other non-asyncy languages. Today, in the C world, we'd need to write a bunch of code to turn that asynchronous function into a synchronous one. This could foreseeably result in C applications under-performing on newer WASI system calls - disproportionately impacting them.

At the same time it is important not to rule out innovation and technology advancement, both in the runtime implementation and in the realization of the WASI standards themselves. So rather than a blanket rejection of new concepts like async, the introduction of a performance check means that we can allow the implementation and standard to evolve over time, while ensuring that we don't disproportionately affect individual languages, particularly C, which is critical for Industrial and IoT based applications.

The implication, of course, may be, with regard to the malloc discussion between @sunfishcode and @penzn, that the adoption of a WIT implementation would be dependent on showing no regression in performance between a witx implementation and a wit implementation. Some data to validate this may be required?

The resulting performance checks may yield performance observations in a number of the key languages WASM targets: C, Rust, Go? Others?

I know this puts an additional onus on those proposing standard changes, and that this may slow standard evolution, but this is perhaps justified: as the WASI ecosystem matures the cost of interface change increases, and rapid, performance-impacting changes will drive away adoption in general.

The Parallel life of WIT and WITX
During the discussion between @lukewagner and @xwang98 a recommendation was made that while WIT may be used for standardization and discussion, that it should be possible, using a tool to derive witx, at least until a point in the future where we have support for WIT and the component model in a wider set of runtime implementations.

Justification / Mitigation of Concern
This provides the necessary time to allow some of the embedded customers of WASM to plan and adopt a migration strategy, while also allowing them to continue to propose, adopt, and prepare for new interfaces and proposals.

This may extend the life of WITX, but it provides an important runway for WAMR in particular. There are WAMR users with 100,000s of individual devices deployed in the field. The rate of runtime change is considerably slower in the IoT and embedded world than in the cloud / data center environment. Aside from the practical implications of deploying an updated runtime, there is a need to allow time for the runtime's own evolution to support the component model, and to ensure it can continue to execute the same functional payload with the existing hardware specifications (see performance above).

A concern would also be that a move to a WIT-only world at this point in the WASI journey may result in standards only being proposed and considered by the subset of the community actively engaged in WIT compliant runtimes. Those not working on runtimes which can adopt the standards would struggle with the context necessary for active participation in the conversation, and wouldn't be able to propose standards which they themselves could implement, at least until we have wider support.

Taken together, I think these two suggestions help to address a number of concerns which impact the embedded and IoT world. Thoughts?

penzn commented 10 months ago

My final 2¢: I don't mean to be gloomy or delay adoption of the Component Model in WASI, I just think this should be shared for awareness.

I think this change, with mandating component model, as opposed to being 'just an external API', presents a qualitative change, however minor, where WASI would need some core features. There are two reasons for it in my view:

This is concerning, because so far there is not much traction in supporting the component model or other core features mandated by this change in browsers (the threads draft hasn't been presented yet, for example). This would create divergence between WASI and the 'stock' web environment, while more convergence would be preferable in my opinion, as there are currently some challenges with running the same code both ways.

Maybe a compromise would be to allow WIT and WITX to coexist for the time being; at least this way exploration of the component model can continue while maintaining backwards compatibility with the existing WASI approach.

lukewagner commented 10 months ago

(sorry for the slow reply due to holiday + travel) Thanks for the thorough writeup @woodsmc. For my part, I agree on both your broader points and suggestions. As for how I think we could go about concretely integrating these into the docs and process:

For the ‘Performance Equivalence’ point, while I agree that our goal is to ensure that Preview 2 doesn't regress performance (after all, performance is one of the main motivating factors for using wasm in the first place), I think it’s important that we don’t choose fragile evaluation criteria or ones that prevent iteration and real-world feedback. Having worked on benchmarks for some years before, it’s surprisingly easy to write well-meaning benchmarks that completely misrepresent real-world performance and lead developers in the wrong direction. Additionally, just by nature of older code paths having received more optimization, I think it’s important to distinguish between any temporary regressions that may occur due to newness and lack of optimization vs. essential performance differences that indicate roadblocks to better performance in the future. Lastly I think, at this point in time, it's really important to ship our next iteration of WASI this year and fix any lingering performance issues in the next iteration next year. Based on all that, my suggestion is that we add a “Performance goals” subsection to preview2/README.md (added in #550) that says something to the effect of:

It is a goal and expectation that Preview 2 will not regress performance of real-world workloads compared to Preview 1. In the Preview 2 timeframe, WASI interfaces will aim to achieve the best performance they can given the current feature set of Wit and the Component Model. However, in the interest of shipping Preview 2 promptly to gather real-world feedback and inform the next iteration (Preview 3), modest regressions may need to wait until the next iteration to be fully addressed.

For the ‘Parallel Life of WIT and WITX’ point, agreed and perhaps we could add a “WITX” section to the current WitInWasi.md that describes how .witx files can be derived from .wit files according to the Canonical ABI and how wasm engines can implement single-module components using just these derived .witx files and their existing WITX machinery.

Does that sound reasonable?
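To give a feel for what deriving a core-level signature from a WIT signature involves, here is a simplified Python sketch of the Canonical ABI's flattening rules for an imported function, as I understand them. It covers only a small subset of types, and the limits `MAX_FLAT_PARAMS = 16` and `MAX_FLAT_RESULTS = 1` as well as the example function signatures are assumptions of this sketch; the Canonical ABI explainer is the authority on the real rules.

```python
# Simplified flattening of a WIT-level signature into core wasm value types
# for an import. Covers a subset of types only; illustrative, not normative.

FLAT = {
    "bool": ["i32"], "char": ["i32"],
    "u8": ["i32"], "u16": ["i32"], "u32": ["i32"],
    "s8": ["i32"], "s16": ["i32"], "s32": ["i32"],
    "u64": ["i64"], "s64": ["i64"],
    "f32": ["f32"], "f64": ["f64"],
    "string": ["i32", "i32"],  # pointer + length
}
MAX_FLAT_PARAMS = 16   # assumed limit before params spill to memory
MAX_FLAT_RESULTS = 1   # assumed limit before results spill to memory


def flatten_type(wit_type):
    """Return the core value types used to pass one WIT value."""
    if wit_type.startswith("list<") and wit_type.endswith(">"):
        return ["i32", "i32"]  # pointer + element count
    return list(FLAT[wit_type])


def flatten_import(params, results):
    """Return (core_params, core_results) for an imported function."""
    flat_params = [c for p in params for c in flatten_type(p)]
    flat_results = [c for r in results for c in flatten_type(r)]
    if len(flat_params) > MAX_FLAT_PARAMS:
        flat_params = ["i32"]  # single pointer to a params record in memory
    if len(flat_results) > MAX_FLAT_RESULTS:
        flat_params = flat_params + ["i32"]  # caller-supplied return pointer
        flat_results = []
    return flat_params, flat_results


# Hypothetical WIT signature: read: func(fd: u32, len: u64) -> list<u8>
read_sig = flatten_import(["u32", "u64"], ["list<u8>"])

# Hypothetical WIT signature: count: func(name: string) -> u32
count_sig = flatten_import(["string"], ["u32"])
```

A tool emitting .witx from .wit would apply a mapping of this kind to every function in an interface, so an engine with existing WITX machinery sees only core value types.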

lukewagner commented 10 months ago

@penzn Agreed that we should generate WITX from WIT (via the Canonical ABI) to help developers transition. However, to the wasi-threads point, I think threads are wholly independent: the problem with wasi-threads is the O(MxN) function table space usage it implies (assuming M threads and N functions) and the consequent problems for dlopen()-style dynamic linking, not the component model. If it weren't for these problems, it would be easy enough to add a thread.spawn canon built-in to the component model (there are a few already). Thus, we shouldn't consider the component model as part of the threading story -- that is orthogonal and driven by fundamental core wasm representation issues.
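The O(MxN) figure can be worked through with small numbers. The Python sketch below assumes an instance-per-thread model in which every thread's instance carries its own function table with one slot per function; the thread count, function count, and `BYTES_PER_SLOT` are made-up illustrative values, not measurements of any engine.

```python
# Worked arithmetic for O(M x N) function table space under an assumed
# instance-per-thread model. All numbers are illustrative.

def table_slots(threads, functions):
    """Total table slots when each of M threads has its own N-entry table."""
    return threads * functions


BYTES_PER_SLOT = 8  # assumption: engine-dependent in practice

threads = 64
functions = 10_000

slots = table_slots(threads, functions)
approx_bytes = slots * BYTES_PER_SLOT
```

With these inputs the tables hold 640,000 slots in total, and doubling either the thread count or the function count doubles that number.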

penzn commented 10 months ago

I think threads are wholly independent: the problem with wasi-threads is the O(MxN) function table space usage it implies (assuming M threads and N functions) and the consequent problems for dlopen()-style dynamic linking, not the component model.

Do you have a link to a discussion/notes where this was stated previously? I can see a couple prior discussions where it was either implied or directly said that threads and similar APIs are ultimately incompatible with Component Model. For example in https://github.com/WebAssembly/wasi-threads/issues/48#issuecomment-1644487201: These limitations also turn out to make it incompatible with the component model and WASI Preview 2, and further down. Further, function references/pointers can't be represented in component model syntax (WebAssembly/component-model#117) and those would be required for a threads-like API.

Even if the issue of threads was completely decoupled from the component model, it would still be an additional instance of needing a core wasm feature in WASI; instance-per-thread multithreading is used on the Web today.

pchickey commented 10 months ago

There's been a lot of discussion on this thread, and in many other venues, on this topic. We have pushed the vote on this back a month so that these discussions can continue. We believe that all of the key concerns are resolved enough that we can hold a vote tomorrow for this PR to land. So this is a last call for tomorrow's vote - if you still have concerns that would warrant a "no" vote, can you please let us know today? It's also ok to say so at the meeting / vote itself, of course.

yamt commented 10 months ago

However, to the wasi-threads point, I think threads are wholly independent: the problem with wasi-threads is the O(MxN) function table space usage it implies (assuming M threads and N functions) and the consequent problems for dlopen()-style dynamic linking, not the component model.

component model is often used to link core instances and it in some cases needs to involve function table, thus has problems wrt wasi-threads similar to dynamic linking, doesn't it?

yamt commented 10 months ago

which category does e.g. fd_read (besides malloc, it lacks iov) fall into?

We haven't optimized it yet.

when you say "haven't optimized yet but will", do you mean adapter functions?

The malloc call can be optimized away by changing how the bindings are generated. iov functionality would require adding the feature to the canonical ABI spec, but it's doable.

ok.

do you have any idea when/if such optimizations can be made?

at core wasm level, what do such optimized versions look like? will they have different import names from the current version? i guess it's difficult to distinguish optimized and current versions by core func type alone.

my understanding is that such an optimization will be an ABI breaking change at core wasm level. so, if we allow apis to be implemented at core wasm level (w/o component model), we need some kind of versioning visible at core wasm level, like somehow changing import/export names. right?

sunfishcode commented 10 months ago

I've now added commits to #550 to add the content of the discussion about performance and witx and wit.

pchickey commented 10 months ago

This PR was approved by unanimous consent in the 9/07/23 WASI subgroup meeting https://github.com/WebAssembly/meetings/blob/main/wasi/2023/WASI-09-07.md