rustwasm / wasm-bindgen

Facilitating high-level interactions between Wasm modules and JavaScript
https://rustwasm.github.io/docs/wasm-bindgen/
Apache License 2.0

Allow opaque but owning pointer indirection in bindgen'd types #3405

Open HeroicKatora opened 1 year ago

HeroicKatora commented 1 year ago

Motivation

In $business some core module is written in Rust. This module is then compiled to WASM in order to make the logic implemented in this module accessible to a web application. However, the primary goal is neither to derive nor to modify JavaScript state; rather, JavaScript provides the IO-ful glue code to control the WebAssembly state via the web server. As such, the WASM state outlives any individual function invocation. (The executable model describes this style as a "reactor".)

We eagerly started by adding #[wasm_bindgen] sections and some connector handles to implement the interface. The idea was to retain a connector in JS which would call its methods when some event occurs, to send the appropriate state change to the module from outside its own main loop. But we quickly noticed a conflict: when the connector owns an attribute, this triggers bindgen to recursively require those attributes to also be sendable over the ABI. This is not what we want. The state we refer to is internal state which will be confined to the module, and it contains several third-party types which may be impossible to define as working with wasm-bindgen. Thus, the connectors cannot refer to such internal (non-ABI-passed) state directly.

An indirection over some owning pointers is not easily possible, either. The standard containers (Box, Vec) are only implemented for types that are introspectable as well. The proposed solution is to change this and to provide standard inversions of reference types, so to speak.

Proposed Solution

We want to be able to export strongly typed, owning handle instances from WASM. So, let's add non-ABI-recursive versions of Box, Arc, etc. In our case, we did this via a macro, since bindgen-annotated types are also not yet allowed to be generic in any way.

Macro definition for `mk_box`, an opaque `Box` equivalent.

```rust
#[macro_export]
macro_rules! mk_box {
    ($(#[doc = $li:literal])* $v:vis struct $name:ident ($t:ty)) => {
        #[wasm_bindgen::prelude::wasm_bindgen]
        $v struct $name {
            /// The pointer of a box. For type and ABI purposes, this is a raw pointer instead so
            /// that wasm_bindgen 'does the right thing' which is *not* to look into the inner
            /// contents any further.
            box_ptr: *mut $t,
        }

        // Safety: only way to construct the first instance, from an existing box allocation. The
        // box itself is forgotten, passing all ownership of the instance and allocation to us.
        impl From<$crate::alloc::boxed::Box<$t>> for $name {
            fn from(boxed: $crate::alloc::boxed::Box<$t>) -> Self {
                $name {
                    box_ptr: $crate::alloc::boxed::Box::leak(boxed),
                }
            }
        }

        /// Reconstruct a box by value.
        impl From<$name> for $crate::alloc::boxed::Box<$t> {
            fn from(this: $name) -> $crate::alloc::boxed::Box<$t> {
                let val = core::mem::ManuallyDrop::new(this);
                // Safety: the allocation is moved from `val`, which is afterwards invalid. Since
                // it is in a manually drop it is not used any further.
                unsafe { $crate::alloc::boxed::Box::from_raw(val.box_ptr) }
            }
        }

        impl core::ops::Deref for $name {
            type Target = $t;
            fn deref(&self) -> &$t {
                // Safety: The pointer was obtained from `Box::leak` which is a valid `$t`.
                unsafe { &*self.box_ptr }
            }
        }

        impl core::ops::DerefMut for $name {
            fn deref_mut(&mut self) -> &mut $t {
                // Safety: The pointer was obtained from `Box::leak` which is a valid `$t`.
                unsafe { &mut *self.box_ptr }
            }
        }

        impl Drop for $name {
            fn drop(&mut self) {
                // Safety: `box_ptr` owns an value and allocation by construction.
                let _ = unsafe { $crate::alloc::boxed::Box::from_raw(self.box_ptr) };
            }
        }
    };
}
```

Proposed usage

use wasm_bindgen_opaque::mk_box;

#[allow(dead_code)]
struct TestInner {
    /// this type is impossible to 'own' in a JS module, not enough info.
    hidden: Box<dyn core::fmt::Display>,
}

mk_box! {
    /// This wrapper type contains a Box-indirection that can be sent over ABI.
    struct AbiInner(TestInner)
}

#[wasm_bindgen::prelude::wasm_bindgen]
pub struct Connector {
    inner: AbiInner,
}

#[wasm_bindgen::prelude::wasm_bindgen]
impl Connector {
    pub fn show_mut(&mut self) -> String {
        format!("{}", self.inner.hidden)
    }

    pub fn show_ref(&self) -> String {
        format!("{}", self.inner.hidden)
    }
}
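
To check the ownership behaviour independently of wasm-bindgen, the wrapper can be written out by hand. The following is only a sketch of roughly what `mk_box!` would expand to for `AbiInner`, with the `#[wasm_bindgen]` attribute left out so it compiles with the standard library alone. `Tracked` and `round_trip` are hypothetical helpers added purely to count drops; they are not part of the proposal.

```rust
use std::cell::Cell;
use std::fmt;
use std::mem::ManuallyDrop;
use std::ops::Deref;
use std::rc::Rc;

/// Hypothetical helper: a `Display` value that records when it is dropped.
struct Tracked {
    label: &'static str,
    drops: Rc<Cell<usize>>,
}

impl fmt::Display for Tracked {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.label)
    }
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

struct TestInner {
    hidden: Box<dyn fmt::Display>,
}

/// Hand-written stand-in for the macro output (no `#[wasm_bindgen]` here).
struct AbiInner {
    box_ptr: *mut TestInner,
}

impl From<Box<TestInner>> for AbiInner {
    fn from(boxed: Box<TestInner>) -> Self {
        // The wrapper takes over both the value and its allocation.
        AbiInner { box_ptr: Box::into_raw(boxed) }
    }
}

impl From<AbiInner> for Box<TestInner> {
    fn from(this: AbiInner) -> Self {
        // Suppress `AbiInner::drop`, otherwise the allocation would be freed twice.
        let this = ManuallyDrop::new(this);
        // Safety: `box_ptr` came from `Box::into_raw` and is not used again.
        unsafe { Box::from_raw(this.box_ptr) }
    }
}

impl Deref for AbiInner {
    type Target = TestInner;
    fn deref(&self) -> &TestInner {
        // Safety: `box_ptr` points to a live `TestInner` owned by `self`.
        unsafe { &*self.box_ptr }
    }
}

impl Drop for AbiInner {
    fn drop(&mut self) {
        // Safety: `box_ptr` owns the value and allocation by construction.
        drop(unsafe { Box::from_raw(self.box_ptr) });
    }
}

/// Returns the displayed text, the drop count after dropping a wrapper, and
/// the drop count after additionally converting a second wrapper back to a box.
fn round_trip() -> (String, usize, usize) {
    let drops = Rc::new(Cell::new(0));
    let make = |label| Box::new(TestInner {
        hidden: Box::new(Tracked { label, drops: Rc::clone(&drops) }),
    });

    let wrapped = AbiInner::from(make("first"));
    let shown = format!("{}", wrapped.hidden);
    drop(wrapped);
    let after_wrapper_drop = drops.get();

    let boxed: Box<TestInner> = AbiInner::from(make("second")).into();
    drop(boxed);
    (shown, after_wrapper_drop, drops.get())
}
```

Each value is dropped exactly once on both paths, which is the property the `ManuallyDrop` in the `From<AbiInner>` impl is there to guarantee.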

Alternatives

Within the wasm-bindgen library there are probably better alternatives:

Additional Context

Since the macro solution can be implemented as a library, it may be possible to provide full code (or a release) on request.

Liamolucko commented 1 year ago

wasm-bindgen should only require struct fields to be sendable to JS if they're public. If JS doesn't actually need access to a field, you can disable exposing it to JS by either making it private or adding #[wasm_bindgen(skip)] to the field, e.g.:

#[wasm_bindgen]
pub struct Connector {
    #[wasm_bindgen(skip)]
    pub hidden: Box<dyn Display>,
}

// or

#[wasm_bindgen]
pub struct Connector {
    // note the lack of `pub`
    hidden: Box<dyn Display>,
}

HeroicKatora commented 1 year ago

Generally I'm more comfortable with having a type here instead of needing to tag each usage site with an attribute. Nevertheless, a wrapper with the attribute is probably also possible.

It doesn't fully address the problem, however. As far as I understand, values of type Connector will stay within the wbindgen heap when they are operated on (returned, used as parameters). Can this indirection be avoided? The same is not true when returning values of type String or some known box types, for instance. In a sense, it would seem that some wasm-bindgen generated values are already allocated into a stack, so it would make sense to me if it were possible to just change where a value is allocated, in particular if the allocator is the global one. That would also avoid having to move the value from one heap/stack to another when taking it by value.

But depending on that answer, should this issue also be changed to a documentation tag? There's no mention of how skip influences the requirements that are calculated, and the error message when the annotation is missing suggests the ABI may be involved. How exactly does this work, anyway?

The error message without the attribute suggests that each attribute must individually be sendable via the ABI. I don't know how feasible it is to improve the error message if the attribute is 'forgotten'.

error[E0277]: the trait bound `Box<(dyn std::fmt::Display + 'static)>: IntoWasmAbi` is not satisfied
 --> src/lib.rs:8:1
  |
8 | #[wasm_bindgen]
  | ^^^^^^^^^^^^^^^ the trait `IntoWasmAbi` is not implemented for `Box<(dyn std::fmt::Display + 'static)>`

Liamolucko commented 1 year ago

It doesn't fully address the problem, however. As far as I understand, values of type Connector will stay within the wbindgen heap when they are operated on (returned, used as parameters). Can this indirection be avoided?

Not really. You could theoretically take all the object's state from Rust's memory and put it in a JS object, but that would mean having to copy it back again every time you want to call a method, since Rust can only operate on things inside its own memory. So, it'd be far less efficient.

You might be thinking of the other 'heap' used by wasm-bindgen, which stores JS values that have been passed to Rust. But that isn't even involved when using Rust values passed to JS.

HeroicKatora commented 1 year ago

When returning a Rust value whose type has a #[wasm_bindgen] annotation on it, where is that value stored? As far as I can tell, the representation on the JS side ends up being pretty much: {ptr:usize}. Maybe this isn't always the case, but the mechanism here is not really known to me. To the best of my reading of the code, this behavior is controlled mainly by IntoWasmAbi.

In effect, I'd like to take a Box's allocation directly and move the resulting pointer directly into ptr. (This seems similar to the ownership passing of JsValue.) Even better would be more control, i.e. passing a fat pointer as a JS object with two pointer-sized attributes could still sometimes be preferable over the indirection, yet I totally understand that changing the ABI types is much more complicated. A Box<T> or equivalent wrapper would not require such a change.

In the above, the extra complexity via the Connector wrapper is not actually desirable. If possible, passing only the opaque boxed attribute value by transferring its pointer over the ABI would be better. Yet the structure is necessary to tag a field with the proc-macro (even though I misunderstood the need for a custom privacy macro, it seems).
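
To make the request concrete, here is a rough sketch of the pointer handoff I have in mind, written in plain Rust without wasm-bindgen. It is not how wasm-bindgen implements `IntoWasmAbi`; `Hidden`, `into_handle`, `with_handle`, `from_handle` and `demo` are hypothetical names, and the `usize` handle only stands in for the `ptr` field seen on the JS side.

```rust
/// Hypothetical internal state that should never be described to JS.
pub struct Hidden {
    pub counter: u64,
}

/// Give up ownership of the box and expose only its address.
/// No second allocation is made; the handle *is* the box pointer.
pub fn into_handle(state: Box<Hidden>) -> usize {
    Box::into_raw(state) as usize
}

/// Borrow the state behind a handle for the duration of one call.
///
/// # Safety
/// `handle` must come from `into_handle`, must not have been passed to
/// `from_handle` yet, and must not be borrowed elsewhere at the same time.
pub unsafe fn with_handle<R>(handle: usize, f: impl FnOnce(&mut Hidden) -> R) -> R {
    // Safety: guaranteed by the caller as documented above.
    unsafe { f(&mut *(handle as *mut Hidden)) }
}

/// Take ownership back; the handle must not be used afterwards.
///
/// # Safety
/// `handle` must come from `into_handle` and must not have been consumed before.
pub unsafe fn from_handle(handle: usize) -> Box<Hidden> {
    // Safety: guaranteed by the caller as documented above.
    unsafe { Box::from_raw(handle as *mut Hidden) }
}

/// Round trip: export the box, mutate through the handle, take it back.
pub fn demo() -> u64 {
    let handle = into_handle(Box::new(Hidden { counter: 1 }));
    // Safety: the handle was just created and is consumed exactly once.
    unsafe {
        with_handle(handle, |state| state.counter += 41);
        from_handle(handle).counter
    }
}
```

The point of the sketch is that taking the value by value on the Rust side is just `Box::from_raw` on the number that was handed out, with no copy between heaps.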