WebAssembly / WASI

WebAssembly System Interface

Long-term support for WASIp1 in the toolchain #595

Closed loganek closed 6 months ago

loganek commented 7 months ago

I apologize in advance if this is not the right repository for this issue; I'm happy to move it somewhere else. I added an item to the agenda for the next WASI meeting to discuss this topic: https://github.com/WebAssembly/meetings/pull/1543

Context

There are teams, including mine, that support millions of devices running WebAssembly today. The software on these devices is partially updatable. The host native code, including the WASM runtime, is baked into the device's firmware and is either not updatable or can be updated only rarely. The other part is WebAssembly code running on that runtime, which can be updated frequently. Currently, all of our devices, as well as those of other teams, run the WebAssembly Micro Runtime (WAMR) with only WASIp1 support, and we'll need to support them for an extended period (likely 5+ years). This means that the WASM binary (which we can and want to update frequently) must use only WASIp1 interfaces, since the runtime won't support WASIp2 and subsequent releases.

We aim to leverage the most recent version of the toolchain, not only for bug fixes or potential performance improvements but also for new features (primarily memory64, which should have complete support in WAMR very soon, and exception handling). However, as the community is pushing towards WASIp2 and the Component Model, which are not backward compatible with WASIp1, finding a solution to this problem is not straightforward for our team and others in a similar situation.

To address this issue, I am currently considering several options. Some of these options involve moving away from WASI, but they are beyond the scope of our current discussion. The relevant options that enable us to continue using the standard tooling (mainly WASI libc/WASI SDK, but also Rust compiler and others) while addressing the compatibility concerns are as follows:

  1. Support both WASIp1 and WASIp2 in WASI-libc and other tools
  2. Provide WASIp2 → WASIp1 adapter

Support both WASIp1 and WASIp2 in WASI-libc and other tools

The approach has been briefly described in https://github.com/WebAssembly/wasi-libc/pull/476/files, but it was defined as a solution for a "transition" period to enable teams to smoothly migrate from WASIp1 to WASIp2. However, our team does not have a path to transition to WASIp2 at all (at least not in the next few years). Therefore, the proposed "temporary" approach could potentially be used as a permanent solution for our case.

A major disadvantage of this approach is the increased code complexity, which might affect the development of further WASI versions. Looking at the code written for WASIp2 in WASI libc so far, the changes for different WASI versions could be moved to separate files to avoid conflicts and allow for almost independent development of WASIp1 and WASIp2. While having preprocessor directives (#ifdefs) for different versions of WASI may be unavoidable, their impact can be minimized by moving most of the version-specific code to separate files and enabling them conditionally in the build script.
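
As a rough illustration of that layout, the sketch below keeps a single version-independent interface and selects one of two implementations at build time. The macro `EXAMPLE_WASI_P2` and all function names are hypothetical and are not the names used in WASI libc. In a real tree the two implementations would sit in separate source files chosen by the build script; they share one file here only so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shared, version-independent interface (hypothetical name). In a real
 * tree this declaration would sit in a common header. */
const char *example_backend_name(void);

#if defined(EXAMPLE_WASI_P2)
/* Would live in a WASIp2-only source file, built only for that target. */
const char *example_backend_name(void) { return "wasip2"; }
#else
/* Would live in a WASIp1-only source file, built only for that target. */
const char *example_backend_name(void) { return "wasip1"; }
#endif

/* Version-independent caller: no #ifdef is needed at this level. */
int example_is_p1(void) {
    const char *name = example_backend_name();
    assert(name != NULL);
    return strcmp(name, "wasip1") == 0;
}
```

Built without `-DEXAMPLE_WASI_P2`, the WASIp1 implementation is selected; the calling code stays identical for both versions.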

To address these concerns, we propose setting up a continuous integration (CI) system and running a set of tests as part of it to ensure the functionality of WASIp1. We can create a formal group of contributors interested in maintaining WASIp1 and agree on service-level agreements (SLAs) to fix any blocking issues. This group can also support developers focused on WASIp2+ development with any changes that are affected by the existence of WASIp1. By establishing a dedicated CI system and a formal group of contributors focused on WASIp1 maintenance, we can ensure the continued support and stability of WASIp1 while allowing the community to move forward with the development of newer WASI versions.

There has also been pushback from the community about adding new features to WASIp1. As mentioned earlier, we would like to have memory64 support soon, as well as exception handling and potentially other features in the future. I understand the feedback that WASIp1 should be frozen and no longer extended. At the same time, we also need to consider the existing systems already running in production and the business problems that teams must solve today, and come up with reasonable trade-offs. It's important to note that we are not proposing to update the WASI standard itself (although I know @woodsmc had ideas to extend WASIp1 further, but that's topic for a separate discussion), but rather the tooling around it.

Implement WASIp2 → WASIp1 adapter

Another alternative we are considering is to build an adapter that translates (a subset of) the WASIp2 ABI to WASIp1. With this approach, we will use the "wasip2" target in the toolchain but use "wasm-ld" as the linker (instead of the default "wasm-component-ld") so that the output binary is a WASM core module with the WASIp2 core ABI. The adapter will be implemented as a tiny WASM library that will be linked to the WASIp2 binary and implement undefined symbols.
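
To make the idea concrete, here is a minimal sketch of one adapter function, assuming a WASIp2-shaped "monotonic clock now" call that returns a 64-bit timestamp and a WASIp1-shaped `clock_time_get` underneath. The function names are hypothetical: the real undefined symbol name is derived from the WIT interface and is not shown here, and the WASIp1 import is replaced with a mock so the sketch runs natively.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the WASIp1 import clock_time_get. On a real target the
 * runtime provides it; here it is mocked and writes a fixed timestamp.
 * Returns 0 on success, or a WASIp1 errno value. */
static uint16_t mock_p1_clock_time_get(uint32_t clock_id, uint64_t precision,
                                       uint64_t *time_out) {
    (void)precision;
    if (clock_id != 1) return 28; /* only "monotonic" (1) is mocked; 28 = EINVAL */
    *time_out = 123456789ULL;
    return 0;
}

/* Hypothetical adapter entry point: the function a WASIp2-targeting libc
 * would leave undefined and the adapter library would supply. */
uint64_t adapter_p2_monotonic_now(void) {
    uint64_t now = 0;
    uint16_t err = mock_p1_clock_time_get(1 /* monotonic */, 1, &now);
    assert(err == 0); /* the p2-shaped function has no error channel */
    return now;
}
```

Calls like this one translate almost one-to-one; the harder cases are the ones discussed below, where the two ABIs disagree on who owns a buffer.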

This adapter approach would potentially allow us to completely remove WASIp1 from our systems. It also helps address any potential backward incompatibilities between different WASI releases (as long as it's possible to convert calls from one version to another). I wrote a small prototype for the adapter here: https://github.com/loganek/wasi-snapshot-preview2-to-preview1-adapter/tree/main/wsp2_to_wsp1_adapter However, there are a few concerns with this approach:

Potential performance impact

C stdlib functions rarely return pointers to memory allocated by the function itself. The common pattern is to let the caller provide the buffer and its size, e.g.:

size_t fread(void *buffer, size_t size, size_t count, FILE *stream);

Here buffer is a caller-provided, already allocated buffer, and size and count define its capacity. A common practice in WIT (and what is already defined for the read function in the WASI-IO proposal) is to return the buffer, e.g.:

read: func(
  /// The maximum number of bytes to read
  len: u64
) -> result<list<u8>, stream-error>;

This means the native implementation of the adapter (or runtime) must allocate memory (through the WASM allocator, which is exposed as the cabi_realloc WASM export) and return a pointer from the function. This adds overhead because:

  1. New memory must be allocated (even though the user of the libc interface already provided a buffer)
  2. The data must be copied from the adapter-allocated buffer to the caller-provided buffer.
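
The two costs can be seen in a small native sketch. The type `example_list_u8` and both functions are hypothetical stand-ins: `mock_stream_read` plays the role of a WIT-style read that returns a freshly allocated list, and `example_read_into` plays the role of a libc-style wrapper that must fill a caller-provided buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Shape of a returned list<u8>: pointer plus length, owned by the caller. */
typedef struct { uint8_t *ptr; size_t len; } example_list_u8;

/* Mock of a WIT-style read: the callee allocates the result buffer. */
static example_list_u8 mock_stream_read(size_t max_len) {
    static const char data[] = "hello";
    size_t n = max_len < 5 ? max_len : 5;
    example_list_u8 out = { malloc(n ? n : 1), n };
    assert(out.ptr != NULL);
    memcpy(out.ptr, data, n);
    return out;
}

/* fread-like wrapper: the caller supplies the buffer, so the returned
 * list has to be copied into it and then released. */
size_t example_read_into(void *buffer, size_t capacity) {
    example_list_u8 chunk = mock_stream_read(capacity); /* 1. extra allocation */
    memcpy(buffer, chunk.ptr, chunk.len);               /* 2. extra copy */
    free(chunk.ptr);
    return chunk.len;
}
```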

To remove this inefficiency, we could provide a custom implementation of the allocator exported to the adapter (cabi_realloc). The allocator would keep thread-local state that lets libc set a custom buffer (or a list of buffers), and a pointer to that buffer would be returned on the next cabi_realloc call. That way we pass the caller-provided buffer all the way down to the runtime.

Because the code of the allocator is rather small, it could likely be inlined to avoid unnecessary calls that would affect performance.

A prototype of the allocator is here: https://github.com/loganek/wasi-libc/commit/015fa0a64ab6df9d016990cc6b36782172657324. We also have an example usage of that in the adapter’s prototype: https://github.com/loganek/wasi-snapshot-preview2-to-preview1-adapter/blob/main/wsp2_to_wsp1_adapter/wamr/socket.c#L144
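
For readers who do not want to open the prototype, the following is an independent, simplified sketch of the idea and not the linked code. All names are hypothetical; the allocator is called `example_cabi_realloc` to avoid clashing with a real `cabi_realloc`, and it takes the same parameter order (old pointer, old size, alignment, new size). It ignores alignment and does not handle a later resize of a handed-out buffer, both of which a real implementation would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Thread-local slot holding a caller-provided buffer for the next
 * allocation request. */
static _Thread_local void  *pending_buf;
static _Thread_local size_t pending_len;

/* Called by libc right before invoking an import that returns a list. */
void example_set_next_buffer(void *buf, size_t len) {
    pending_buf = buf;
    pending_len = len;
}

void *example_cabi_realloc(void *ptr, size_t old_size, size_t align,
                           size_t new_size) {
    (void)old_size;
    (void)align; /* alignment is ignored in this sketch */
    /* Fresh allocation that fits the pending buffer: hand it out once. */
    if (ptr == NULL && pending_buf != NULL && new_size <= pending_len) {
        void *out = pending_buf;
        pending_buf = NULL;
        pending_len = 0;
        return out;
    }
    /* Fallback: ordinary heap allocation. */
    void *out = realloc(ptr, new_size ? new_size : 1);
    assert(out != NULL);
    return out;
}
```

With this in place, a libc read wrapper would call `example_set_next_buffer(buffer, capacity)` before the import, and the host would write directly into the caller's buffer with no second allocation or copy.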

Identify accidental use of WASIp2 functions

One major concern with the adapter approach is that WASIp1 covers only a subset of the WASIp2 interfaces. Even after linking to the adapter, the resulting binary may still contain references to WASIp2 interfaces due to the way certain functions are extended in WASI libc for WASIp2. For example, the close() function now calls a WASIp2-specific function (wasi:sockets/udp@0.2.0[resource-drop]udp-socket), even if UDP is not used, as the compiler cannot determine file descriptor types at compile-time.

Additionally, since the code will be compiled using the WASIp2 toolchain, developers targeting WASIp1 may accidentally use features not available in WASIp1. While code reviews and testing on WASIp1 runtimes can mitigate this risk, they are not as effective as checking for the introduction of new imports in the binary.
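
A check for new imports can be fairly small. The sketch below is one possible shape for it, not an existing tool: it walks the import section of a core WASM module and counts imports whose module name differs from an allowed one (for example `wasi_snapshot_preview1`). The function name `count_foreign_imports` is hypothetical, only core modules are handled (component binaries are rejected), and section bounds are not enforced beyond the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Cursor over a byte buffer; `ok` turns false on any out-of-bounds read. */
typedef struct { const uint8_t *p, *end; int ok; } cursor;

static uint8_t read_byte(cursor *c) {
    if (c->p >= c->end) { c->ok = 0; return 0; }
    return *c->p++;
}

/* Unsigned LEB128, as used throughout the WASM binary format. */
static uint64_t read_leb(cursor *c) {
    uint64_t value = 0;
    for (unsigned shift = 0; shift < 64 && c->ok; shift += 7) {
        uint8_t b = read_byte(c);
        value |= (uint64_t)(b & 0x7f) << shift;
        if (!(b & 0x80)) return value;
    }
    c->ok = 0;
    return 0;
}

static void skip(cursor *c, uint64_t n) {
    if (n > (uint64_t)(c->end - c->p)) { c->ok = 0; return; }
    c->p += n;
}

static void skip_limits(cursor *c) {
    uint8_t flags = read_byte(c);
    read_leb(c);                 /* minimum */
    if (flags & 1) read_leb(c);  /* maximum, if present */
}

/* Returns the number of imports whose module name differs from `allowed`,
 * or -1 if the buffer cannot be parsed as a core module. */
int count_foreign_imports(const uint8_t *bytes, size_t len, const char *allowed) {
    static const uint8_t header[8] = {0, 'a', 's', 'm', 1, 0, 0, 0};
    if (len < 8 || memcmp(bytes, header, 8) != 0) return -1;
    cursor c = { bytes + 8, bytes + len, 1 };
    int foreign = 0;
    while (c.ok && c.p < c.end) {
        uint8_t id = read_byte(&c);
        uint64_t size = read_leb(&c);
        if (!c.ok) break;
        if (id != 2) { skip(&c, size); continue; }  /* 2 = import section */
        uint64_t count = read_leb(&c);
        for (uint64_t i = 0; i < count && c.ok; i++) {
            uint64_t mlen = read_leb(&c);
            const uint8_t *mod = c.p;
            skip(&c, mlen);                          /* module name */
            skip(&c, read_leb(&c));                  /* field name */
            if (!c.ok) break;
            if (mlen != strlen(allowed) || memcmp(mod, allowed, mlen) != 0)
                foreign++;
            switch (read_byte(&c)) {                 /* import descriptor */
            case 0: read_leb(&c); break;                    /* func: type index */
            case 1: read_byte(&c); skip_limits(&c); break;  /* table */
            case 2: skip_limits(&c); break;                 /* memory */
            case 3: read_byte(&c); read_byte(&c); break;    /* global */
            default: c.ok = 0;
            }
        }
    }
    return c.ok ? foreign : -1;
}
```

A CI step could run such a check on every built binary and fail when the count is non-zero, which catches an accidental WASIp2 dependency without needing a WASIp1 runtime in the loop.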

Another related problem can happen when the WASIp2 function that can’t be emulated with WASIp1 is on the call path to another function that can be emulated. For example, we might have a function foo() in WASI libc which is implemented as:

void foo() {
  x(); // WASIp2 function that can't be emulated using WASIp1 interfaces
  y(); // WASIp2 function that can be emulated using WASIp1 interfaces
}

While this could be worked around by providing a dummy implementation of x() in the adapter, the feasibility depends on the semantics of foo(), x(), and y(). However, I’m not sure if such patterns will be observed in the actual implementation of WASI libc.
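
A sketch of that workaround, under the assumption that foo() does not depend on any side effect of x(): the adapter supplies a do-nothing x() that records it was reached, so tests can at least detect reliance on the stub. The flag, the result variable, and `example_assert_stub_unused` are hypothetical additions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

static bool x_stub_was_called; /* lets tests detect reliance on the stub */
static int  y_result;

/* Dummy adapter implementation: x() cannot be emulated on WASIp1. */
void x(void) { x_stub_was_called = true; }

/* Stands in for a WASIp2 function that the adapter can emulate. */
void y(void) { y_result = 42; }

void foo(void) {
    x();
    y();
}

/* Hypothetical test hook: fails loudly if code that is meant to be
 * WASIp1-only ever reached the stubbed x(). */
void example_assert_stub_unused(void) { assert(!x_stub_was_called); }
```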

Summary

While the adapter approach reduces maintenance overhead and potentially opens up opportunities to use standard tooling with non-standard runtime extensions (e.g., sockets in WAMR), accidental usage of WASIp2 functions is a challenge that needs to be addressed. At the moment, I don't see a strong mitigation for this issue (the code reviews and automated tests mentioned above are good, but each team would have to write its own set of test cases for its use case), but I remain open to exploring potential solutions through further discussion and collaboration with the community. Until a suitable solution emerges, my preference is to maintain WASIp1 support in the tooling and extend it with new features as long as there are contributors willing to maintain it and to absorb any potential disruptions to WASIp2+ development. This approach ensures compatibility and support for existing systems while allowing for the gradual adoption of newer WASI versions.

I'm very interested in hearing the community's thoughts on this matter. I'm open to comments or other ideas that could potentially address the problem. I'm also keen to understand the perspectives of the tooling maintainers, as their insights will be valuable in shaping the way forward.

ricochet commented 7 months ago

I am separating my responses into multiple comments to make it easier to respond to individual ideas.

Support both WASIp1 and WASIp2 in WASI-libc and other tools

Maintainers (including @sunfishcode ) are aligned with this plan and agree we should continue supporting WASIP1. Your proposal to break apart support into multiple files to avoid conflicts between wasip1 and wasip2+ is a good idea to make the maintenance easier. Bringing the resources, maintainers, and willingness to support the CI makes this a solid and welcome plan. Thank you so much!

ricochet commented 7 months ago

A WASIp2 → WASIp1 adapter

This adapter is something many of us want to see and believe is technically feasible.

For the concerns around accidental usage of new APIs, this is also tooling that my team and others need for the wasip2 ecosystem to handle scenarios like using features not yet available in our host, like 0.2.1 vs 0.2.2 additions. This is absolutely something we can coordinate around building. One of the capabilities we plan to use is in this issue.

We also plan to enhance the component model to support this adapter use-case going forward to avoid the performance trade-off as documented in this issue as well as the proposed change in https://github.com/WebAssembly/component-model/issues/314. @sunfishcode is also investigating adding optional support for caller-supplied buffers to the canonical ABI. Additional changes and proposals are encouraged and we'd love to keep the feedback loop open for this use-case.

ricochet commented 7 months ago

I agree that the wasip1 proposal should remain frozen; the established practice of providing host-specific additional APIs for content to use continues to be the right way to address needs that go beyond what wasip1 provides.

For wasip2, extensibility is part of the modular design which will make these types of issues much easier to adapt with forward and backward compatibility.

One thing I don't quite understand is how these new features/extensions are meant to be delivered in a scenario where the runtime cannot be updated. Is it that the current version on these devices already includes the support that you need?

ricochet commented 7 months ago

My feeling is that we should pursue and support both options as outlined.

woodsmc commented 7 months ago

One thing I don't quite understand is how these new features/extensions are meant to be delivered in a scenario where the runtime cannot be updated. Is it that the current version on these devices already includes the support that you need?

Perhaps I can supply some nuance...
There are two scenarios here:

  1. New Devices
  2. Existing Devices already in customer hands

New Devices
These can ship with an updated runtime. They often do. This is driven by sector-specific competition and commercial pressure.

Existing Devices Already in Customer Hands
Updating a runtime is not impossible, and the difficulty varies from product to product. It is, however, considerably harder than in the cloud space. The updates are pushed out by the product owner, but applying the update is up to the user. Not everyone updates at the same time, so mixed landscapes exist. Additionally, any update that is pushed out can fail and leave user devices "bricked"; these devices can't be rolled back, which can lead to RTO. There is, therefore, considerable reluctance to update. This impedance is measurably greater than in the cloud / data center world, where rolling updates are applied to specific data centers and can be quickly rolled back if an error is detected. The actual update frequency will be commercially bound, due to the risks involved.

This leads naturally to the next question: Why not just switch to preview 2? Risk and availability: Preview 1 works, while Preview 2 is not available for the currently preferred runtime (WAMR) and hasn't been validated. It will take some time before this happens, and it may not even happen within the Preview 2 time frame; it may need Preview 3.

We're kinda stuck. There will be a period of time from now, until we can consider moving to the latest WASI standard. Which leads to the next question...


I apologize in advance, as Marcin had noted that this is a separate point, but since it was mentioned again in the comments, I want to just chip in with some clarification.

So, what do we do in the meantime, where we need to innovate (due to competition in our respective sectors), cannot use Preview 2, and cannot propose standard new features for the software stack we currently operate?

This is where, you folks know, I've a different opinion to Bailey:

I agree that the wasip1 proposal should remain frozen; the established practice of providing host-specific additional APIs for content to use continues to be the right way to address needs that go beyond what wasip1 provides.

Ok, so if we follow this guidance, then every WASI Preview 1 runtime out there today will continue to fix issues and add features in its own unique way. This will naturally promote fragmentation in the WASI P1 runtime space. But it also encourages continued investment in specific runtimes. As I mentioned above, commercial pressure forces this investment and innovation to happen. The pressure will add features that users of the respective runtimes will come to rely on.

All of this investment is going to continue until the future WASI standard becomes stable, tested, and ready to use. For some of the embedded community, full adoption may not even be commercially possible to consider for another 4 years. At that point the latest version of WASI will have none of the features developed by the runtimes in the intervening period. So switching to the latest WASI version would mean a reduction in features, or a very expensive process of re-implementing features in a way compatible with the latest WASI ABI.

The net effect of the guidance is therefore the (unintended) construction of a technical and commercial barrier to future WASI adoption.

That is why I've been advocating for not freezing Preview 1 just yet. If we allow innovation in P1 to be supported, by saying that new features must be supported in preview 2 but can also be supported in preview 1, we can encourage consolidation and common solutions for products deployed today. We can also work toward ensuring that those solutions work for P2, P3 and V1. In this case, while there may be an ABI break, there will be source compatibility and feature parity.

At some point, of course, preview 1 could be frozen, but doing it now, particularly when preview 2 is so new, and where we do not have an immutable future ABI could be a bit premature. I feel that delaying this freeze would mitigate the risk I've tried to articulate above. I feel that overall, this will lead to a better, less fragmented future for WASI.

loganek commented 7 months ago

Thanks @ricochet / @woodsmc !

Maintainers (including @sunfishcode ) are aligned with this plan and agree we should continue supporting WASIP1.

Sounds good. Once we have a formal agreement, I'll work out some of the details with the folks who are interested.

For the concerns around accidental usage of new APIs, this is also tooling that my team and others need for the wasip2 ecosystem to handle scenarios like using features not yet available in our host, like 0.2.1 vs 0.2.2 additions. This is absolutely something we can coordinate around building

That's great news. I'd be keen to know what the plan is and how we can work on it together. Perhaps we could have a call to agree on the next steps.

I agree that the wasip1 proposal should remain frozen

I'm actually not suggesting it should; it's just that at this stage we (my team) don't have any business case for pushing that forward. I'll let @woodsmc drive that further though :) (just a note: if we decide to open up WASIp1 for extensions, I'd revisit the wasi-threads inclusion).

One thing I don't quite understand is how these new features/extensions are meant to be delivered in a scenario where the runtime cannot be updated. Is it that the current version on these devices already includes the support that you need?

You're right; features like exceptions or memory64 will not be available on non-updatable devices. This is a sad reality that we'll have to accept and live with. There are a few reasons we don't want to deploy runtimes with preview2 support on new devices:

My feeling is that we should pursue and support both options as outlined.

I think I agree with that. I'm hoping, though, that with the right validation tooling mentioned in https://github.com/WebAssembly/WASI/issues/595#issuecomment-2073238356 we'll be able to remove the WASIp1 support in the toolchains at some point.

lukewagner commented 7 months ago

I think there's two complementary ways we can support the gradual transition from wasip1 to wasip2:

  1. Keeping toolchains producing wasip1 core modules that can run on unmodified wasip1 runtimes.
  2. Enabling existing core wasm runtimes to support new functionality (beyond wasip1) by just adding new core function imports (and not having to implement the full component model spec).

It sounds like everyone already agrees on (1), and there's multiple good approaches being considered.

For (2), the arguments that @loganek and @woodsmc are making make sense to me for why this is important: incrementally adding new core host imports is a much smaller and more incrementally-shippable unit of work than implementing and validating the whole component model. However, I also think we risk creating confusion and fragmentation by simply "branching" WASI and making ad hoc additions to wasip1, so I also agree with @ricochet that we want to keep wasip1 "frozen".

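I think the way to address both these concerns is a sort of midpoint solution where we:

  • Define new functionality in WIT, starting with 0.2 and extending onwards with new interfaces and worlds
  • Define a new "target" for wasi-sdk/wasi-libc and other producer toolchains that says "produce a single core module that imports core functions derived via the Canonical ABI from WIT"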

Just as a sketch (name to be bikeshedded): the new target could be named wasit2, where the t stands for "transitional". Thus, if you build with wasi-sdk with --target=wasm32-wasit2, you'd get a core .wasm, and any core wasm runtime can run this core .wasm by implementing a set of core function imports whose semantics are specified by the Canonical ABI. When wasm64 is ready, --target=wasm64-wasit2 would derive the obvious 64-bit Canonical ABI. There's a number of other details to work out, but chatting with @sunfishcode, it seems like most of the core wasm compilation toolchain does not have to care about the difference between wasip2 and wasit2 -- it's mostly just whether the final link step uses wasm-ld or wasm-component-ld. That also means we can trivially "componentize" any wasit2-produced binaries to run them on full wasip2 runtimes.

Because I think wasit2 would also be useful to embedded devices where I know there has been some concern about cabi_realloc being a source of memory fragmentation over time, we could also pull in the new "caller-supplied buffer" idea that @sunfishcode has been working on. Without getting into the details (which probably belong in a separate issue), the net result could be that wasit2 could have no cabi_realloc at all by construction, which I expect would go a long way toward addressing those concerns.

There's a lot more details to work out, but I just wanted to sketch this wasit2 idea here to see if something of this rough shape is in the right direction.

woodsmc commented 6 months ago

Because I think wasit2 would also be useful to embedded devices where I know there has been some concern about cabi_realloc being a source of memory fragmentation over time, we could also pull in the new "caller-supplied buffer" idea that @sunfishcode has been working on. Without getting into the details (which probably belong in a separate issue), the net result could be that wasit2 could have no cabi_realloc at all by construction, which I expect would go a long way toward addressing those concerns.

Ohh, wasit2 ... @sunfishcode @lukewagner this sounds really interesting, happy to help with the sketching. cc: @ttrenner. Have you kicked off a new issue or another venue where we could continue this idea?

lukewagner commented 6 months ago

@woodsmc Great! There's not any separate issue/discussion yet; this is just an early idea that seemed quite relevant to this issue, so I thought I'd take the temperature here first. But if folks are positive here, I suppose a good next step could be to discuss the technical details further in a PR adding an .md describing wasit2.

ttrenner commented 6 months ago

I think the way to address both these concerns is a sort of midpoint solution where we:

  • Define new functionality in WIT, starting with 0.2 and extending onwards with new interfaces and worlds
  • Define a new "target" for wasi-sdk/wasi-libc and other producer toolchains that says "produce a single core module that imports core functions derived via the Canonical ABI from WIT"

@lukewagner Thank you for that idea. This direction is similar to ideas that came up in a chat with @xwang98. The idea could be to use WIT to generate different targets. If I understood correctly, this could indeed help with the transition, keeping folks on board and finding mitigations for challenges that come up on the journey to WASI V1.

cpetig commented 6 months ago

I really like the idea of a p1 binary with p2 imports as all the old tools (wabt, wamr) continue to work on it.

And since p1 likely doesn't support multi-memory, caller provided buffers are a logical choice.

I can even imagine linking multiple t2 components either ahead of time (monolithic binary) or via shared object "loading". The source code would be identical between p2 and t2, but of course the composing tools would be different.

loganek commented 6 months ago

Hi all, This is the summary of the discussion in this ticket and in the WASI CG meeting we had last week.

The community acknowledges the need to support preview1 for an extended period of time, and any further contributions to the tooling (e.g. EH or memory64) will be accepted by the maintainers as long as there is support / resources from interested teams. I'll work with maintainers to agree on the next steps.

In parallel, we'll continue looking into the adapter approach; @ricochet mentioned there were some thoughts on that, so I'll reach out, discuss the plan, and share it with the community. Once the adapter is in a usable state, we'll likely drop preview1 support in the tooling.

In any case, please contact me on Zulip or email if you're interested in the work and/or willing to support the initiative.

I'm closing the ticket as the main issue has been addressed. There were discussions about extending preview1 or having a wasit2 target. I suggest we have a separate ticket for that - @woodsmc are you going to open one to discuss the extensions?

Thanks all for the discussion