rust-diplomat / diplomat

Experimental Rust tool for generating FFI definitions allowing many other languages to call Rust code
https://rust-diplomat.github.io/book/
Other
499 stars 47 forks source link

Rust Wasm ABI is unstable and will be changed eventually #661

Open Manishearth opened 3 weeks ago

Manishearth commented 3 weeks ago

Previously: https://github.com/rust-diplomat/diplomat/issues/657 (talking about the specific issue we found)

As explained in https://github.com/rust-lang/rust/issues/129486, the Rust Wasm ABI is currently not following tool conventions and is unstable. https://github.com/rust-lang/rust/pull/117919 details the plan to move forward from here, with a -Zwasm-c-abi=spec flag that will become default over time.

This ABI is one where all non-scalar structs are passed as pointers, which means that JS will have to preallocate all structs (except those that wrap a single scalar) and pass them in as pointers. More expensive, but fine.

On the plus side, we can get rid of a lot of the finicky code from https://github.com/rust-diplomat/diplomat/pull/660.

We don't have to do this now, but we should before Rust changes defaults. When we do we should document it. We can potentially support both forms if people really ask for it, but it could get pretty gnarly.

sffc commented 3 weeks ago

Do we have an idea of how wasm-c-abi=spec impacts the structs ICU4X actually uses?

Knowing how "crossing the WASM ABI" hasn't seemed to have been an area where enough people really have put enough effort, there might be room to influence the outcome upstream, especially if we can share performance numbers about how much faster certain real-world ICU4X operations might be in the unstable Rust Wasm ABI versus the "spec" Wasm ABI.

RalfJung commented 3 weeks ago

The ABI that rustc currently uses is effectively unsound due to https://github.com/rust-lang/rust/issues/115666 -- or at least, it is highly unclear whether it satisfies our ABI compatibility rules. So I don't think we should keep using it under any circumstances. If the wasm ABI can be improved, that should be done as a wasm-wide change, not by rustc unilaterally using a different extern "C" ABI than the reset of the ecosystem. If Rust wants to offer a more efficient ABI on that target, that should be a different ABI string (not "C"), and the ABI needs to be actually designed properly instead of just exposing codegen implementation details in unsound ways like the current ABI. The current ABI is entirely an accident, not something that was ever intended to work the way it does.

Manishearth commented 3 weeks ago

Do we have an idea of how wasm-c-abi=spec impacts the structs ICU4X actually uses?

It basically means that Diplomat slices and ICU4X options bags need to be allocated on the Rust heap before being passed to Rust from JS.

Instead of:

wasm.foo(field1, field2);

we will have to do


let buffer = diplomat_alloc(enough space for struct);
ptrWrite(buffer, 0, field1);
ptrWrite(buffer, offset, field2);
wasm.foo(buffer.ptr);
``

There are wasm-wide discussions on doing better here, as Ralf [linked](https://github.com/WebAssembly/tool-conventions/issues/88). We can participate, I'm not sure if we have anything further to bring to the table than what is already being said there.

They seem to be moving towards "pass small structs as fields, pass large structs by pointer", which will kind of be similar to what Rust already does, but with more complexity because of the pointer (whereas currently it's "pass small structs as fields, pass large structs as fields _with padding_").
sffc commented 3 weeks ago

I care slightly less about options bags, but things like DiplomatWriteable and DiplomatSlice are in the critical path for hot functions like format terminals, so I would hope that those don't need an alloc.

I guess a workaround would be to backtrack a bit and inline those structs back to separate parameters in the C functions. Maybe only when the target is WASM.

bjorn3 commented 3 weeks ago

You can manually split slices into two arguments when generating the extern "C" functions instead of depending on the compiler doing this, right?

Manishearth commented 3 weeks ago

I guess a workaround would be to backtrack a bit and inline those structs back to separate parameters in the C functions. Maybe only when the target is WASM.

Generally not a great idea since Diplomat's design is that at the C level it's always the same.

And in particular this would mean complicating the macro again (and in a way that's conditional on wasm!), which isn't great.

You can manually split slices into two arguments when generating the extern "C" functions instead of depending on the compiler doing this, right?

Yes, and we used to do this, but now we consistently use a DiplomatSlice repr(C) struct, which greatly simplifies the code.

So yeah, in theory something we can do, in practice would rather not choose this route.

but things like DiplomatWriteable and DiplomatSlice are in the critical path for hot functions like format terminals, so I would hope that those don't need an alloc.

Note that allocs aren't that bad here. Allocs are bad in native code, but JS hits the allocator all the time, and while the Rust-side allocator in wasm is not the JS allocator, hitting it is not that expensive compared to other things in JS. I don't think this slows down format functions in a way that's going to be super noticeable. Measurable, probably, noticeable, probably not, it's going to be a same-order-of-magnitude change.

Manishearth commented 3 weeks ago

Either way, the proposed ABI of "small structs are passed directly, large structs indirectly" woudl solve this for us.

bjorn3 commented 3 weeks ago

I think you can avoid the allocator by storing everything on the stack instead. You can decrement the stack pointer global before the call, store the arguments at the new stack pointer location, perform the call and restore the stack pointer. This will need the stack pointer to be exported by the wasm module though, or alternatively you need to export a function which adjusts the stack pointer as requested while not touching the stack at all. The latter probably requires hand written wasm.