rustwasm / wasm-bindgen

Facilitating high-level interactions between Wasm modules and JavaScript
https://rustwasm.github.io/docs/wasm-bindgen/
Apache License 2.0
7.8k stars 1.08k forks source link

Support enum variants with values #2407

Open joeriexelmans opened 3 years ago

joeriexelmans commented 3 years ago

Hi folks! This feature was discussed in the past in other issues, but it's been quiet for a while and there hasn't been an issue dedicated specifically to this feature, hence this:

Motivation

Right now, only C-style enums are supported. A C-style enum like this

#[wasm_bindgen]
enum CStyleEnum { A, B = 42, C }

is now converted to a mapping from tag to integer values:

const CStyleEnum = Object.freeze({ A: 0, B: 42, C: 43 });

In your code you can just use CStyleEnum.A, like you would in Rust.

Rust's support for Algebraic Data Types (ADTs) is an awesome feature that I love and use intensely. It would be nice if JavaScript bindings could be generated for an enum like this:

#[wasm_bindgen]
enum MyEnum {
    A,
    B(u32),
    C(u32, u32),
    D { x: u32, y: u32 },
}

Proposed Solutions

JavaScript does not have builtin support for "tagged unions" like Rust does, so a decision has to be made on how we "model" this. There was discussion on this in the earlier C-style enum issue, with 2 proposed solutions. I'll summarize them here.

Solution 1: Tag + array of values

This is the most straightforward solution. It reflects Rust's memory layout for enums.

Typescript definition:

type MyEnum =
  | { tag: 'A' }
  | { tag: 'B', value: number }
  | { tag: 'C', value: [number, number] }
  | { tag: 'D', value: { x: number, y: number }
  ;

I like how currently, for C-style enums, a mapping from enum identifier to an integer is created. We could further generalize this by creating a mapping from enum identifier to enum variant constructor:

const MyEnum_Tags = Object.freeze({ A: 0, B: 1, C: 2, D: 3 });
const MyEnum = Object.freeze({
  A: Object.freeze({ tag: MyEnum_Tags.A }), // just a value, like currently for C-style enums
  B: function() { this.tag = MyEnum_Tags.B; this.val = arguments }, // a constructor
  C: function() { this.tag = MyEnum_Tags.C; this.val = arguments },
  D: function() { this.tag = MyEnum_Tags.D; this.val = arguments },
});

and use it as follows:

 // C-style variant. Reflects usage in Rust, and current usage in JS
const a = MyEnum.A;

// Variants with values. Reflects how in Rust a variant with values is a function:
const b = new MyEnum.B(42); 
const c = new MyEnum.C(42, 43);
const d = new MyEnum.D({x: 42, y: 43});

JavaScript allows any function to be called with any list of parameters, so there's no way to prevent e.g. forgetting the number parameter for variant B, doing something like new MyEnum.B(), but that's just the way it is, in a dynamically typed language.

To "match" an enum variant, one can use JS switch-statements:

switch (variant.tag) {
case MyEnum_Tags.A:
  ...
case MyEnum_Tags.B:
  ...
}

if-else:

if (variant.tag === MyEnum_Tags.C) {
   const [x, y] = variant.val; // this is not too pretty, but values must be put under a field ("val"), in order to prevent naming collisions with the "tag" field.
   ...
}

or use a mapping:

const something = {
  [MyEnum_Tags.A]: ... ,
  [MyEnum_Tags.B]: ... ,
}[variant.tag];

Alternatives

Solution 2: Keep the tag as a key in the object

The following was also proposed:

type MyEnum =
  | { A: any }
  | { B: { '0': number } }
  | { C: { '0': number, '1': number } }
  | { D: { x: number, y: number } }
  ;

I'm personally not in favor, because one is forced to add a dummy value for variant 'A', which is inefficient:

const variant = { A: {} };

And this dummy value has to be truthy, in order for

if (variant.A) { ... }

to match.

Also, I think checking if an object contains a key is less efficient in (non-optimized?) JavaScript, compared to just checking the value of an integer field ("tag").

To "match" a variant, to my knowledge, one can only use if-statements.

if (variant.C) {
  const [x, y] = variant.C;
  ...
}

Additional info

I'm willing to put some work into this. Currently looking at the source of the wasm_bindgen macro to see if I can implement it myself.

alexcrichton commented 3 years ago

It seems reasonable to me to try to support this, thanks for writing this up! This is likely a pretty significant feature to implement, however, and I don't have a ton of time to help with design details myself. I'd be happy to help review though!

joeriexelmans commented 3 years ago

Nice to hear. I will also be a bit busy the coming weeks, but I'll get at it, eventually. Need this feature for a problem I'm working on.

iFreilicht commented 3 years ago

Awesome to see someone else had the idea before me! I do like Approach 1, though I'm not sure if one would get the proper hints by the IDE when calling the constructors. Still, perfectly reasonable and would be a huge improvement!

celsobonutti commented 3 years ago

That's a great feature that will surely help a lot in one of my projects.

You're proposals are awesome, @joeriexelmans, but what do you think about following what the genType ReScript library does with their variants? This seems to work really well, and could help in building a ReScript <-> wasm-bindgen interop.

I'd be more than happy to work on this with you, and am also currently reading the wasm_bindgen macro to get a grisp of how it works.

joeriexelmans commented 3 years ago

Hi @celsobonutti

From quickly reading the ReScript documentation, it seems to me that a downside is that matching a specific variant in JS is somewhat more complex, as a variant value may be either just a "string", or an object { tag, value }, for the same enum. Did I understand this correctly? Is there an upside I'm missing?

I'm still busy, until mid-March, so feel free to "take over" from me. I'll let you guys know soon as I've got time to work on this again.

celsobonutti commented 3 years ago

Hey @joeriexelmans

Yes, the bindings are indeed more complex. I got the wrong understanding about how C-like enums were compiled in wasm_bindgen and thought they were - like ReScript's - strings. Now that I saw how they're currently compiled, this change does not make sense for me anymore.

Sure, I'm pretty busy lately too, but I'll keep trying to understand the macro when I have free time. Thank you a lot for your quick response! 😃

celsobonutti commented 3 years ago

Hey @alexcrichton, how are you doing? I finally got some time to work on this (#2631), but looks like I had a misconception about WasmSlices (I thought I could have a slice with various ABI types) and now it looks to me that this feature would be way easier to implement using wasm multi-value returns. Is there any work on using multi-value in wasm-bindgen? I've searched for it but couldn't find anything 😞

jkomyno commented 2 years ago

Hi everyone, what would be needed to unlock this feature?

iFreilicht commented 2 years ago

@jkomyno basically, the PR #2631 has to be rebased and made ready to merge.

voxelstack commented 2 years ago

Is #2631 still relevant? I'm interested in this and would like to play around with it but starting from scratch might be too much. Getting a couple of pointers would also work.

wilfredjonathanjames commented 2 years ago

A common error handling pattern is to wrap errors from other libraries in an enum variant. Would be great to be able to return a JsValue error inside of our library's error enum, rather than try to map every single one to a unique variant.

If there's a better way to do this please let me know.

iFreilicht commented 2 years ago

@voxelstack Yes, kind of. It does work in that revision as far as I remember, so definitely your best starting point, but you'd have to rebase it onto master. Could be possible, could be a lot of work.

bouk commented 2 years ago

Another option would be to treat non-c-style enums like structs, so they're wrapped in a class on the JavaScript side

tgross35 commented 1 year ago

I think at a minimum, wasm-bindgen could copy what Serde currently lets you do. See this link: https://serde.rs/enum-representations.html, there are four possible representations:

// Externally tagged: enum variant is the key, enum content is the value
// Default representation
{"Request": {"id": "...", "method": "...", "params": {...}}}

// Internally tagged: the enum variant gets a key with the rest of the enum content
// `#[wasm_bindgen(tag = "type")]` on the Rust side
{"type": "Request", "id": "...", "method": "...", "params": {...}}

// Adjacently tagged: the enum content is represented within a separate key
// `#[wasm_bindgen(tag = "t", content = "c")]`
{"type": "Request", "c": {"id": "...", "method": "...", "params": {...}}}

// Untagged: no variant indicator, figure it out based on the content
// `#[wasm_bindgen(untagged)]`
{"id": "...", "method": "...", "params": {...}}

I think that there needs to be a couple possible representations, and mimicking the serde style gives a familiar (and proven) way to do that. In any case, exporting the enum variant constructors as js functions (e.g. MyEnum.C(42, 43)) also sounds like a good idea.

For anyone who needs a fast workaround, using serde_wasm_bindgen lets you pass enums to and from js like the examples above.

septatrix commented 1 year ago

The last one (untagged) would not work for enum where some variants have values and others not and the first one might lead. The first one (internally tagged) may lead to problems with field name conflicts and as per serde documentation does not work with enums containing tuple variants.

So I think the second one (adjacently tagged) or the default which you did not mention (externally tagged) is best suited.

tgross35 commented 1 year ago

I did miss externally tagged, thanks for pointing that out.

The point of my comment was more that I don't think wasm_bindgen does need to pick the "best" representation - instead it needs options, since no single representation is good in every case.

celsobonutti commented 1 year ago

Although I agree that multiple options would be the best, I think we need to focus on getting the feature working first. TS/JS representation is something we can do solely on the JS side, which I believe is easier to modify than actually passing the necessary data back and forth.

I, unfortunately, still have no time to continue working on my PR, but would love to be involved if someone can.

Anatoly03 commented 1 year ago

Is there a way to see if a JsValue is of an enum member declared in the rust code as of now or is everything like that to-do?

I have an enum that is used in the js code and then passed to the rust code on several other occasions, but I can't use the enum in the functions argument.

ThatXliner commented 1 year ago

What’s the status on this? I think I need this implemented, unless there’s other alternatives/workarounds

iFreilicht commented 1 year ago

@ThatXliner as a general rule, if there are no other comments on an issue, the status hasn't changed.

A comment like "What's the status?" is rarely productive and notifies everyone in the conversation, causing noise and distraction, please consider not posting comments like that in the future in highly active repositories.

If you think someone should see this issue, send it to them or mention them here, ideally with an explanation.

If you want to express your support or desire for this feature, upvote the original post.

If you want to show how this feature is useful in a way that hasn't been considered in this thread yet, feel free to post a minimal example. Something like this can be useful for generating test cases.

Thank you.

tgross35 commented 1 year ago

Correct me if I am wrong, but I believe this is just waiting on somebody to implement it (reads this as: it could be you!). It doesn’t seem like it would all that much work, especially since SerDe provides a good reference.

daxpedda commented 1 year ago

Probably needs a reviewer, I'm afraid this will be too big for me to find time to review. E.g. I didn't find the time to review https://github.com/rustwasm/wasm-bindgen/pull/3554 yet.

iFreilicht commented 1 year ago

Correct me if I am wrong, but I believe this is just waiting on somebody to implement it.

True, but so far nobody has done it. There was an attempt in https://github.com/rustwasm/wasm-bindgen/pull/2631, but the original author had no time to work on it further and it was never merged.

DylanRJohnston commented 9 months ago

Hey all, I've recently come across a need for this feature and I'm also between jobs and so have some spare time to work on something like this. However I'm not sure I understand what benefits this approach would provide over the one outlined in the documentation for communicating arbitrary data via serde_wasm_bindgen https://rustwasm.github.io/wasm-bindgen/reference/arbitrary-data-with-serde.html.

Especially since Rust structs exported to JS must be Copy https://rustwasm.github.io/wasm-bindgen/reference/types/exported-rust-types.html so any enum with String is out of the question and the serde_wasm_bindgen approach does not have this problem.

The following example appears to work just fine, and since the implementations ofWasmDescribe, IntoWasmAbi, and FromWasmAbi are all mechanical delegations to serde_wasm_bindgen they could be easily produced with a simple macro

#[derive(Serialize, Deserialize)]
#[serde(tag = "type", content = "data")]
pub enum TestEnum {
    Foo { number: i32 },
    Bar { string: String },
}

impl From<TestEnum> for JsValue {
    fn from(val: TestEnum) -> Self {
        serde_wasm_bindgen::to_value(&val).unwrap()
    }
}

impl wasm_bindgen::describe::WasmDescribe for TestEnum {
    fn describe() {
        JsValue::describe()
    }
}

impl wasm_bindgen::convert::IntoWasmAbi for TestEnum {
    type Abi = <JsValue as IntoWasmAbi>::Abi;

    fn into_abi(self) -> Self::Abi {
        serde_wasm_bindgen::to_value(&self).unwrap().into_abi()
    }
}

impl wasm_bindgen::convert::FromWasmAbi for TestEnum {
    type Abi = <JsValue as FromWasmAbi>::Abi;

    unsafe fn from_abi(js: Self::Abi) -> Self {
        serde_wasm_bindgen::from_value(JsValue::from_abi(js)).unwrap()
    }
}

#[wasm_bindgen(js_name = "newTestEnum")]
pub fn new_test_enum() -> TestEnum {
    TestEnum::Foo { number: 42 }
}

#[wasm_bindgen(js_name = "printTestEnum")]
pub fn print_test_enum(test_enum: TestEnum) {
    match test_enum {
        TestEnum::Foo { number } => console_log!("Foo encountered with value: {}", number),
        TestEnum::Bar { string } => console_log!("Bar encountered with value: {}", string),
    }
}

The main limitation I can think of is that it won't produce Typescript type definitions for the Enum, you can define them manually https://rustwasm.github.io/wasm-bindgen/reference/attributes/on-js-imports/typescript_type.html, but keeping them in sync would be a pain. However I'm not sure this limitation warrants the effort. Maybe it'd be enough to outline this approach for enums in the documentation?

DylanRJohnston commented 9 months ago

Here's a procedural macro for it, the macro attributes get passed directly to serde so you can customize the the tagging convention.

// lib.rs
use proc_macro::TokenStream;

mod serde_wasm_bindgen;

#[proc_macro_attribute]
pub fn serde_wasm_bindgen(attr: TokenStream, item: TokenStream) -> TokenStream {
    serde_wasm_bindgen::expand_macro(attr.into(), item.into())
        .unwrap_or_else(syn::Error::into_compile_error)
        .into()
}
// serde_wasm_bindgen.rs
use proc_macro2::TokenStream;
use quote::{quote, ToTokens};
use syn::{spanned::Spanned, Error, Item};

pub fn expand_macro(serde_attr: TokenStream, tokens: TokenStream) -> syn::Result<TokenStream> {
    let item = syn::parse2::<Item>(tokens)?;
    match item {
        Item::Struct(it) => inner_macro(it.ident.clone(), serde_attr, it.to_token_stream()),
        Item::Enum(it) => inner_macro(it.ident.clone(), serde_attr, it.to_token_stream()),
        item => Err(Error::new(
            item.span(),
            "serde_wasm_bindgen macro can only be applied to structs or enums",
        )),
    }
}

fn inner_macro(
    ident: proc_macro2::Ident,
    serde_attr: TokenStream,
    tokens: TokenStream,
) -> Result<TokenStream, Error> {
    let pound = syn::Token![#](tokens.span()).to_token_stream();

    Ok(quote! {
      #pound[derive(Serialize, Deserialize)]
      #pound[serde(#serde_attr)]
      #tokens

      impl From<#ident> for JsValue {
        fn from(val: #ident) -> Self {
            serde_wasm_bindgen::to_value(&val).unwrap()
        }
      }

      impl TryFrom<JsValue> for #ident {
        type Error = serde_wasm_bindgen::Error;

        fn try_from(value: JsValue) -> Result<Self, Self::Error> {
            serde_wasm_bindgen::from_value(value)
        }
      }

      impl wasm_bindgen::describe::WasmDescribe for #ident {
        fn describe() {
            JsValue::describe()
        }
      }

      impl wasm_bindgen::convert::IntoWasmAbi for #ident {
        type Abi = <JsValue as IntoWasmAbi>::Abi;

        fn into_abi(self) -> Self::Abi {
            Into::<JsValue>::into(self).into_abi()
        }
      }

      impl wasm_bindgen::convert::FromWasmAbi for #ident {
        type Abi = <JsValue as FromWasmAbi>::Abi;

        unsafe fn from_abi(js: Self::Abi) -> Self {
            JsValue::from_abi(js).try_into().unwrap()
        }
      }
    })
}
#[serde_wasm_bindgen(tag = "type", content = "data")]
pub enum TestEnum {
    Foo { number: i32 },
    Bar { string: String },
}

#[wasm_bindgen(js_name = "printTestEnum")]
pub fn print_test_enum(test_enum: TestEnum) {
    match test_enum {
        TestEnum::Foo { number } => console_log!("Foo encountered with value: {}", number),
        TestEnum::Bar { string } => console_log!("Bar encountered with value: {}", string),
    }
}

I made enum -> JsValue infallible as I don't think this can panic, however JsValue -> enum definitely can fail, so I've implemented TryFrom. However for the FromWasmAbi if someone has called this function with the wrong type I don't think there's anything we can sensibly do other than panic (hence the unwrap).

tinrab commented 9 months ago

@DylanRJohnston I've created a macro that does this for me. It generates TypeScript based on serde and uses serde_wasm_bindgen for ABI conversions. Here' a blog post, the crate (/bomboni_wasm), and an example.

RReverser commented 9 months ago

It generates TypeScript based on serde and uses serde_wasm_bindgen for ABI conversions.

FWIW there's also tsify, which does this as well.

j-nava commented 3 months ago

FWIW tsify seems to be a little abandoned right now, but there's an active and updated fork called tsify-next.