rustwasm / wasm-bindgen

Facilitating high-level interactions between Wasm modules and JavaScript
https://rustwasm.github.io/docs/wasm-bindgen/
Apache License 2.0
7.43k stars 1.02k forks source link

Align with the "C" ABI #3454

Open daxpedda opened 1 year ago

daxpedda commented 1 year ago

Motivation

We are currently using a non-standard ABI (there isn't really a standardized one) for Wasm exports, which currently Rust is mimicking: https://github.com/rust-lang/rust/issues/71871.

This has the big disadvantage of making the wasm32-unknown-unknown backend ABI incompatible with wasm32-wasi and wasm32-unknown-emscripten. Which in turn makes wasm-bindgen not compatible with any C/C++ dependency: https://github.com/rustwasm/wasm-bindgen/issues/2317.

Plan

  1. [x] Make wasm-bindgen follow the C ABI outlined here: https://github.com/WebAssembly/tool-conventions/blob/main/BasicCABI.md. Which mainly involves removing splatting.
  2. [x] Let users try it out with a compiler flag in Rust. (https://github.com/rust-lang/rust/pull/117919: -Zwasm-c-abi)
  3. [x] Figure out a migration path for users. (https://github.com/rust-lang/rust/pull/117918)
  4. [ ] Move the wasm32-unknown-unknown target to the C ABI.
  5. [ ] (optional) Introduce a flag to wasm-bindgen that uses the unstable "wasm" ABI: https://github.com/rust-lang/rust/issues/83788.

Result

It should be possible to compile Rust with wasm32-unknown-unknown while consuming Emscripten and WASI dependencies correctly, which would also partially address #3421. I know nothing about Emscripten, but WASI would require some additional hook ups to support it's API, which we should probably (help) support as well.

Downsides

There is a reason why the ABI differs, splatting is a performance improvement and the C ABI has no support for multi-value. Hopefully this can be addressed in the future and a better ABI can be agreed on. In the meantime users can use the "wasm" ABI on nightly to opt into splatting without us having to break C compatibility for everybody else.

alexcrichton commented 1 year ago

One possible solution I've thought may work in the past is that there could be something like:

pub trait IntoWasmAbi: WasmDescribe {
    type Abi1: WasmAbi;
    type Abi2: WasmAbi;

    fn into_abi(self) -> (Self::Abi1, Self::Abi2);
}

where most impls have type Abi2 = (); and extern "C" fn foo(_: ()) ends up not having any arguments in wasm (e.g. the () argument is elided). That would require a lot of plumbing in the macro, however, but should probably get the splat behavior desired here (I think, at least for strings, I'm not sure for optional things)

temeddix commented 8 months ago

Is this fixed with #3595? Or are there additional steps needed?

daxpedda commented 8 months ago

See OP. I believe the next step to take is migrating rustc to the new ABI.

ranile commented 8 months ago

@Liamolucko did https://github.com/rustwasm/wasm-bindgen/pull/3595 check off the first part of the plan mentioned in this issue? If so, can you check it off in the issue as well

Liamolucko commented 8 months ago

@Liamolucko did https://github.com/rustwasm/wasm-bindgen/pull/3595 check off the first part of the plan mentioned in this issue? If so, can you check it off in the issue as well

Yes.

temeddix commented 8 months ago

I'm not an expert here, please excuse me for basic questions.

The next step is written as "Move the wasm32-unknown-unknown target to the C ABI.". Does this mean that new C ABIs produced by wasm-bindgen when running cargo build are going to change? Would this be a small work or a major rewrite?

daxpedda commented 8 months ago

Does this mean that new C ABIs produced by wasm-bindgen when running cargo build are going to change?

This is a change that has to be done in rustc, see https://github.com/rust-lang/rust/issues/71871. The output compiling for wasm32-unknown-unknown by rustc will change, not the output by wasm-bindgen.

Would this be a small work or a major rewrite?

My understanding is that this should actually be a fairly simple PR.

ranile commented 8 months ago

My understanding is that this should actually be a fairly simple PR.

If I, as someone who has never worked on rustc, were to take up on this, where would I even start?

temeddix commented 8 months ago

This is a change that has to be done in rustc, see https://github.com/rust-lang/rust/issues/71871

I left a comment at Rust repo to match the C ABI between wasm32-wasi and wasm32-unknown-unknown.

temeddix commented 8 months ago

There's a new PR at Rust that matches C ABI of wasm32-unknown-unknown with clang and wasm32-wasi.

It would be awesome if wasm-bindgen support wasm32-wasi, as it opens up whole new possibilities.

So currently, Rust's wasm32-unknown-unknown is the only wasm target that is using the legacy ABI, and this has to be replaced with a standard-ish ABI. The thing is, we have to come up with a simultaneous strategy that makes the change in both Rust and wasm-bindgen repositories, because wasm32-unknown-unknown and wasm-bindgen are tightly interconneted,

After talking with @daxpedda and @bjorn3, the best strategy seems to be accepting breaking changes and notifying the users in a proper way. I suggest that wasm-bindgen go on from 0.2.x to 0.3.0 that adopts unified Rust ABI for both wasm32-unknown-unknown and wasm32-wasi.

In this scenario, the new Rust(Possibly 1.75) can warn users using wasm-bindgen 0.2.x to upgrade, as @bjorn3 mentioned that warning about a specific crate's version is possible from the Rust side.

As a result, this strategy will solve two of the steps mentioned in OP at once.

This will also pave the way for a PR by @Rob2309 so that it can be finally merged.

I hope we can reach a certain consensus first, because supporting wasm32-wasi cannot be done without cooperation between Rust and wasm-bindgen. Any opinions and thoughts would be appreciated. Thank you :)

Liamolucko commented 8 months ago

So currently, Rust's wasm32-unknown-unknown is the only wasm target that is using the legacy ABI, and this has to be replaced with a standard-ish ABI. The thing is, we have to come up with a simultaneous strategy that makes the change in both Rust and wasm-bindgen repositories, because wasm32-unknown-unknown and wasm-bindgen are tightly interconneted,

I think I need to clarify that as of #3595, wasm-bindgen already supports the standard C ABI that wasm32-wasi uses. No changes are needed to wasm-bindgen to make it work with the new ABI, #3595 made it so that wasm-bindgen now works exactly the same regardless of which ABI is being used.

After talking with @daxpedda and @bjorn3, the best strategy seems to be accepting breaking changes and notifying the users in a proper way. I suggest that wasm-bindgen go on from 0.2.x to 0.3.0 that adopts unified Rust ABI for both wasm32-unknown-unknown and wasm32-wasi.

I don't think a breaking release is needed; that's only necessary if existing code might stop working after upgrading, which isn't the case since wasm-bindgen is still compatible with the old ABI.

temeddix commented 8 months ago

Thanks for the reply. Does that mean we can compile Rust for wasm32-wasi? If so, what change do we need? Would #3324 be helpful?

Liamolucko commented 8 months ago

Does that mean we can compile Rust for wasm32-wasi? If so, what change do we need?

The main thing stopping you from doing that right now is that #3233 recently changed WASI to behave like a native target, making it so that trying to call imports from JS panics. That PR had a valid reason for doing so, so you'd need to make it configurable whether WASI is treated as a native or JS target.

It might also be possible to do some shenanigans where wasm-bindgen generated imports panic by default, unless you run the CLI which replaces them with actual imports; that would mean pulling in the whole panic runtime though, which is a no-go unless walrus is able to detect that it's unused and remove it. The wasm file would also end up full of useless descriptor functions in the scenario where you don't run the CLI, although I guess code size is probably less of a concern outside of a web browser.

Would #3324 be helpful?

All of the ABI-related stuff is now redundant, but the WASI shims could still be useful.

rafaelbeckel commented 1 month ago

Hey all, is there a specific issue or PR matching the second item in the checklist, "Move the wasm32-unknown-unknown target to the C ABI"?

Would https://github.com/rust-lang/rust/pull/117919 be a match? If so, could you update the description to include the link in the checkbox? I'm not sure if the box should be checked, though. Maybe wait for the correct ABI to hit stable and become the default without a flag?

Also, the way they're rolling this out would probably match the third box, "Figure out a migration path for users": introduce a flag first, then add a lint warning for old versions of wasm-bindgen, then eventually make it the default.

daxpedda commented 1 month ago

The situation has turned out to be much simpler.

At some point, Rust will switch the wasm32-unknown-unknown target to use the C ABI by default, there is currently no PR for that because the lint (https://github.com/rust-lang/rust/pull/117918) has barely made it into Rust v1.78, so maybe the switch can happen in September, which I think is Rust v1.82. This switch requires no changes in wasm-bindgen and therefor no user migration path.

https://github.com/rust-lang/rust/pull/117919 introduced an unstable compiler flag, -Zwasm-c-abi, which you can use today with wasm-bindgen, to try out the C ABI. I added this to the OP now.

rafaelbeckel commented 1 month ago

If you arrived here while searching for how to build a Rust + C WASM binary for the wasm32-unknown-unknown target, here's a minimal example.

CryZe commented 1 week ago

LLVM is about to enable multivalue by default, which means that likely with LLVM 19, Rust would start running into problems if the spec compliant ABI isn't active by default by then. Though I believe Rust's targets could just deactivate that LLVM feature if it's a problem.

bjorn3 commented 1 week ago

Is it LLVM which will enable it by default or clang? Rustc doesn't enable any wasm features by default on any of the wasm targets (except for the features required for threads in case of the wasm32-wasip1-threads target), so I rustc should probably disable multivalue if LLVM starts enabling it by default. It would be interesting to see how wasi-sdk and Emscripten are going to handle it being on by default though as it is a big ABI break and not documented in https://github.com/WebAssembly/tool-conventions/blob/main/BasicCABI.md at all, so it would be hard for other languages to replicate the ABI LLVM uses when multivalue is enabled even though said languages may depend on clang compiled libraries.

Edit: If you were referring to https://github.com/llvm/llvm-project/pull/80923, then it wouldn't change the C abi, so disregard my comment other than the part about that rustc should probably disable the feature by default.

CryZe commented 1 week ago

Yeah I probably should've clarified that it does not change the ABI, however in at least #3552 it was necessary to use the standard C ABI for it to behave. For clang it already is enabled by default and thus probably emscripten as well. I'm not sure there've been any releases with it being enabled yet though.

daxpedda commented 1 week ago

Rustc doesn't enable any wasm features by default on any of the wasm targets ...

Last time I checked it does, see for example https://github.com/rust-lang/rust/issues/109807.

... rustc should probably disable multivalue if LLVM starts enabling it by default.

We will have to see if this breaks ABI or not, I also guess it won't.

However the reference type proposal will also be enabled by default, which is something we schould probably disable in Rustc.

Liamolucko commented 1 week ago

I'm not sure whether or not Rust's non-standard C ABI will be affected, but I think the solution I mentioned in #3552 is still applicable if it is:

Now that I think about it though, it should actually be fine to switch only return types to match the standard C ABI right away. The two ABIs are identical in how they handle return types, as long as multivalue is disabled, and wasm-bindgen is already broken when multivalue is enabled anyway. That would be enough to fix this.