Allow explicitly naming imported function

tiziano88 commented 5 years ago

I am embedding the resulting Wasm code into a custom runtime that I am writing, and currently I need to de-mangle exported function names to remove the prefix and suffix that wasm-bindgen generates. I would like to have full control over the exported name, something like #[no_mangle] but also allowing setting an arbitrary name different from the Rust one. I am happy to look into this, if this does not sound like a terrible idea?

alexcrichton commented 5 years ago

Does the js_name attribute on exports solve your use case here?

tiziano88 commented 5 years ago

Thanks, I actually realised I meant imports rather than exports (i.e. controlling the name of functions called from the generated WebAssembly code). That said, imports also have a js_name attribute, but sadly this only controls part of the final generated name:

#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(js_name = renamed_test_function)]
    fn test_function(s: &str);
}

results in a Wasm module with the following Import (based on the output of wasm-objdump):

 - func[1] sig=1 <__wbg_renamedtestfunction_595a20b20932503a> <- __wbindgen_placeholder__.__wbg_renamedtestfunction_595a20b20932503a

Note that the name of the function still has a __wbg_ prefix, it is still mangled to remove non-alphanumeric characters, and it is still suffixed with what appears to be a hash of the method's signature. Perhaps we could have a #[wasm_bindgen(raw_name = blah)] attribute that controls the exact name of the symbol?
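
For reference, the de-mangling workaround mentioned at the top of the thread could look roughly like the sketch below. It assumes the `__wbg_<name>_<16 hex digits>` shape seen in the wasm-objdump output above, which is an observation from this one example and not a documented format. `strip_wbg_decoration` is a hypothetical helper name.

```rust
/// Strips the `__wbg_` prefix and the trailing `_<hash>` segment from an import name.
/// Assumption: the hash-like segment is exactly 16 hex digits, as in the example above.
fn strip_wbg_decoration(name: &str) -> Option<&str> {
    let rest = name.strip_prefix("__wbg_")?;
    // Split at the last underscore: everything after it should be the hash.
    let (base, hash) = rest.rsplit_once('_')?;
    let looks_like_hash = hash.len() == 16 && hash.bytes().all(|b| b.is_ascii_hexdigit());
    if looks_like_hash && !base.is_empty() {
        Some(base)
    } else {
        None
    }
}
```

Even when this succeeds, it cannot restore the underscores that were already removed from the name, so the result here would be `renamedtestfunction` and not `renamed_test_function`.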

Pauan commented 5 years ago

@tiziano88 Your custom runtime runs JavaScript code? Why does it need to demangle the names, rather than resolving them in the same way as the standard Wasm<->JS spec?

Note that even though the imports/exports are mangled, they refer to an unmangled name in the generated .js code:

// Mangled name here
__exports.__wbg_f_renamed_test_function_test_function_n = function(arg0, arg1) {
    let varg0 = getStringFromWasm(arg0, arg1);

    // Unmangled name here
    renamed_test_function(varg0);
};

So as long as your renamed_test_function function exists in the global scope, then it will work.

Alternatively your runtime could expose a magical built-in module (similar to fs, path, etc. in Node), and then you could import that:

#[wasm_bindgen(module = "my-magical-module")]
extern "C" {
    #[wasm_bindgen(js_name = renamed_test_function)]
    fn test_function(s: &str);
}

This avoids polluting the global scope, but is otherwise the same.

tiziano88 commented 5 years ago

@Pauan thanks for the link, I had not seen it before, though it does not seem to explain the way mangling works? For my use case, I do not have any JS running at all, I have basically a Wasm interpreter running as a server, and implementing some host calls natively (e.g. to log or access the network etc). Because the runtime is agnostic of how the code was compiled (i.e. it should work for code compiled from Rust, C++, Go, etc), I would like to avoid coupling it to the naming conventions of wasm-bindgen specifically.
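
To make the "agnostic runtime" idea concrete, here is a minimal sketch of how a host might resolve imports by exact (module, field) name, with no knowledge of any toolchain's mangling scheme. All names (`HostImports`, `HostFn`, `host_add`) and the i32-only signature are assumptions for illustration, not part of any real runtime.

```rust
use std::collections::HashMap;

/// Signature of a native host call in this sketch: i32 arguments in, i32 out.
type HostFn = fn(&[i32]) -> i32;

/// Maps (module, field) import names to native implementations.
struct HostImports {
    table: HashMap<(String, String), HostFn>,
}

impl HostImports {
    fn new() -> Self {
        HostImports { table: HashMap::new() }
    }

    fn register(&mut self, module: &str, field: &str, f: HostFn) {
        self.table.insert((module.to_string(), field.to_string()), f);
    }

    /// Exact-match lookup: no prefix stripping or de-mangling.
    fn resolve(&self, module: &str, field: &str) -> Option<HostFn> {
        self.table
            .get(&(module.to_string(), field.to_string()))
            .copied()
    }
}

/// Example host call: sums its arguments.
fn host_add(args: &[i32]) -> i32 {
    args.iter().sum()
}
```

With a table like this, a module that imports `host.add` resolves, while one that imports a decorated name such as `__wbg_add_<hash>` does not, which is why exact control over import names matters for this use case.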

Pauan commented 5 years ago

@tiziano88 The name mangling is an implementation detail of wasm-bindgen, so you shouldn't rely on it.

The link I gave is for JS code communicating with wasm code (which is wasm-bindgen's purpose).

If you don't have JS running, how do you intend to use wasm-bindgen? Wasm-bindgen generates JS code, and requires JS to run. It is intended for JS environments.

If you are running pure wasm (no JS), then you don't need wasm-bindgen at all, you can just do cargo build --target wasm32-unknown-unknown, since Rust has built-in support for wasm.

As for renaming, you can just create a thin fn wrapper:

extern "C" {
    fn renamed_test_function(s: &str);
}

#[inline]
pub unsafe fn test_function(s: &str) {
    renamed_test_function(s);
}

By making renamed_test_function private and only exposing test_function, that effectively "renames" the binding.

That also gives you the opportunity to provide safe/convenient wrappers, do pre/post validation, etc.
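
For imports specifically, plain Rust (no wasm-bindgen) also has a `#[link_name]` attribute that sets the exact symbol name independently of the Rust-side name. A minimal sketch, assuming a hypothetical import module called "host" and a pointer-plus-length calling convention in place of &str. The `renamed_test_function` definition at the bottom is only a native stand-in so the example links outside a Wasm runtime, and `LAST_LEN` is a made-up name used to observe the call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Raw import. `link_name` is the exact symbol requested from the host.
// The import module name "host" is an assumption for this sketch.
#[cfg_attr(target_arch = "wasm32", link(wasm_import_module = "host"))]
extern "C" {
    #[link_name = "renamed_test_function"]
    fn test_function_raw(ptr: *const u8, len: usize);
}

/// Safe wrapper: passes the string as pointer + byte length.
pub fn test_function(s: &str) {
    // SAFETY: `ptr` and `len` describe `s`, which stays alive for the whole call.
    unsafe { test_function_raw(s.as_ptr(), s.len()) }
}

// Native stand-in for the host side; a real Wasm runtime would supply this import.
#[cfg(not(target_arch = "wasm32"))]
pub static LAST_LEN: AtomicUsize = AtomicUsize::new(0);

#[cfg(not(target_arch = "wasm32"))]
#[no_mangle]
pub extern "C" fn renamed_test_function(_ptr: *const u8, len: usize) {
    // Record the length so a caller can check that the call went through.
    LAST_LEN.store(len, Ordering::SeqCst);
}
```

Callers only ever see the safe `test_function("hello")`; the raw import and its unsafe call stay private to the module.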

This article is written for C, but it will probably help you as well:

https://doc.rust-lang.org/nomicon/ffi.html

Similarly, you can go the other way, and expose functions to the wasm runtime:

#[no_mangle]
pub extern "C" fn foo(x: i32, y: i32) -> i32 {
    x + y
}

None of this requires wasm-bindgen.

However, wasm only supports simple types (i32, i64, f32, and f64), so for more complex things (like structs, or strings, or Vec) you'll need to do some marshalling.

wasm-bindgen does marshalling, however the marshalling assumes that it is running in a JS environment, so I don't think that will work for you.

So you'll need to create your own marshalling system. It might be possible to reuse the Rust<->C marshalling (I don't have any experience in this area).
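
One common C-style convention for strings is a pointer plus a byte length. A minimal sketch of the guest side, assuming the host has already copied UTF-8 bytes into the module's linear memory (how that space gets allocated is left out here). `count_words` is a hypothetical export name.

```rust
/// Counts whitespace-separated words in a UTF-8 buffer.
/// Returns -1 if the bytes are not valid UTF-8.
///
/// # Safety
/// `ptr` must point to `len` readable bytes.
#[no_mangle]
pub unsafe extern "C" fn count_words(ptr: *const u8, len: usize) -> i32 {
    // Rebuild a byte slice from the two plain integers that crossed the boundary.
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    match std::str::from_utf8(bytes) {
        Ok(text) => text.split_whitespace().count() as i32,
        Err(_) => -1,
    }
}
```

Only integers cross the boundary, so the same export could in principle be called by any host or paired with any guest language that agrees on the convention.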

If you can do that, it will provide a stable base to support multiple languages (since many languages have a C FFI).

You can also take a look at how wasm-bindgen does marshalling (it won't be exactly what you need, but it might give you some good ideas).

tiziano88 commented 5 years ago

@Pauan thanks, you are correct, it turns out I actually don't need wasm-bindgen at all; I can do almost everything using plain FFI and compiling for wasm32-unknown-unknown (what I was missing was how to control the module of the expected imports, which is described at https://rustwasm.github.io/book/reference/js-ffi.html). Also it seems that there is already some sort of serialisation built in, though I have not investigated what the actual format of that is?

I am still missing how to actually mark a Wasm start function using plain Rust FFI, which AFAICT using wasm-bindgen can be done using the #[wasm_bindgen(start)]. Any ideas how to do that?

Pauan commented 5 years ago

> Also it seems that there is already some sort of serialisation built in, though I have not investigated what the actual format of that is?

It's probably the C marshalling (which is the default). Like I said, I don't have any experience with that, but maybe @alexcrichton knows more.

> I am still missing how to actually mark a Wasm start function using plain Rust FFI, which AFAICT using wasm-bindgen can be done using the #[wasm_bindgen(start)]. Any ideas how to do that?

My understanding is that you shouldn't use the wasm start function. It has many limitations and is not the same as main in Rust. (Please correct me if I'm wrong, @alexcrichton)

Instead, you should export a start function and then your custom runtime will call that (after passing in any necessary imports to the wasm module):

#[no_mangle]
pub extern "C" fn start() {
    println!("Hello!");
}

You can call it whatever you like. To avoid collisions, you might want to call it my_custom_runtime_start (or whatever).

alexcrichton commented 5 years ago

Ah ok, thanks for the update @tiziano88 and the help here @Pauan!

As for serialization, the default FFI boundary has no serialization, just a definition of what the ABI should look like. While wasm-bindgen provides some utilities for serialization, they're all opt-in and library-defined.
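
To illustrate "ABI, not serialization": a `#[repr(C)]` struct crosses the boundary as a pointer to bytes with a fixed layout, and nothing is encoded or decoded along the way. A small sketch with hypothetical names (`Point`, `point_sum`):

```rust
/// `#[repr(C)]` fixes field order and padding to the C layout.
#[repr(C)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

/// Adds the two fields of the pointed-to `Point`.
///
/// # Safety
/// `p` must point to a valid, properly aligned `Point`.
#[no_mangle]
pub unsafe extern "C" fn point_sum(p: *const Point) -> i32 {
    // Read the struct in place; no copy or decoding step is involved.
    let point = unsafe { &*p };
    point.x + point.y
}
```

The caller and callee simply have to agree on the layout, which is the part the ABI defines.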

@Pauan is right in that the start function is "somewhat uncomfortable" in the sense that it's best to provide a symbol that the runtime manually calls. Rust currently has no native support for configuring the start function in the wasm output module, but we'd like to add that one day!

In any case I think this is probably a solved issue now (turns out wasm-bindgen wasn't the right target!), so I'm going to close this.

tiziano88 commented 5 years ago

Yes, I just managed to do everything I need (except start function) using standard FFI, so indeed there is no need to use wasm-bindgen for my use case. Thanks everyone for your support, much appreciated!

rustwasm / wasm-bindgen

Allow explicitly naming imported function #1128