dtolnay / linkme

Safe cross-platform linker shenanigans
Apache License 2.0

WASM support #6

Closed kabergstrom closed 1 year ago

kabergstrom commented 5 years ago

I investigated if it would be possible to add WASM support and it looks like there might be a way. While I have not found a similar start/end convention that works, WASM supports custom data sections which can be accessed as an array from JS. wasm_bindgen has a hidden and somewhat unsupported function that can get a handle to the WebAssembly.Module, while js-sys exposes bindings for fetching data from a custom section. This code compiles:

    let module = js_sys::WebAssembly::Module::new(&wasm_bindgen::module()).unwrap(); 
    let data = js_sys::WebAssembly::Module::custom_sections(&module, "thingies");

@ibaryshnikov created a more complete and runnable example here: https://github.com/ibaryshnikov/wasm-custom-sections/blob/master/src/lib.rs

Unfortunately, webpack does not expose the WebAssembly.Module object in javascript: https://github.com/webpack/webpack/issues/8157

Here's another similar issue: https://github.com/WebAssembly/esm-integration/issues/14

So this approach seems to work, but the WebAssembly.Module handle is not exposed in certain cases. Better than nothing I guess? 😁

daboross commented 5 years ago

I've been playing around with this a bit! It looks promising, but there might also be other issues.

One is that #[link_section] on WASM currently only supports exporting integers, byte arrays, and structs wrapping those. Using any type with indirection or allocation in it results in:

error: statics with a custom `#[link_section]` must be a simple list of bytes on the wasm target with no extra levels of indirection such as references
 --> src/lib.rs:8:1
  |
8 | static A: fn() = x;

Not the end of the world, but this would currently prevent #[link_section] from backing typetag on WASM. Hopefully it's something which can be changed, like the Module being somewhat hidden?

Frizi commented 5 years ago

I was doing some research on how to tackle that issue. Unfortunately without success, but I've found several potential ways to get a bit closer to a solution.

It seems like there is relocation support for wasm custom sections in lld, but this mechanism is unfortunately not utilized by rustc. I tried hacking around this in rustc by putting the static symbol directly into the custom section (without converting it to raw bytes), but this didn't survive into the final wasm binary because lld ignores anything that's not an MDString there. I believe that the relocations might have been performed, but my LLVM knowledge is too limited to actually tell.

Additionally, because custom sections are stored in non-addressable program metadata, having actual data stored there directly means it's no longer mutable. On other platforms, the distributed slice is literally aliasing the memory of individual static declarations. On wasm, there would have to be a separate copy of all data (in fact three in total: the individual statics, the link section data, and the copy owned by the distributed slice).

My guess is that there could be a way to retrieve the addresses of those statics at runtime, then reference those with impl Iterator<Item = &T> instead of the current &[T]. This could potentially be done with custom sections, as long as proper entries were generated for the reloc.* section, so we would end up with a section of pointers. I haven't really found a way to do that without modifying rustc (and I'm not able to modify it the right way either), but I think this could still be researched more.
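To make the shape of that idea concrete, here is a minimal sketch of what iterating through a "section of pointers" could look like from the library side. Everything in it is hypothetical: `Entry`, the three statics, `POINTERS`, and `entries` are illustrative names, and the pointer table is written by hand as a stand-in for what the linker would have to generate. It does not solve the wasm problem; it only shows why the result would be `impl Iterator<Item = &T>` rather than `&[T]`.

```rust
// Hypothetical registry entry; the name and fields are illustrative only.
pub struct Entry {
    pub name: &'static str,
    pub value: u32,
}

// Individual statics, declared wherever they are needed. Nothing guarantees
// that they sit next to each other in memory.
pub static ALPHA: Entry = Entry { name: "alpha", value: 1 };
pub static BETA: Entry = Entry { name: "beta", value: 2 };
pub static GAMMA: Entry = Entry { name: "gamma", value: 3 };

// Stand-in for the "section of pointers". It is written by hand here; the
// open problem in this thread is getting the toolchain to produce it on wasm.
static POINTERS: [&Entry; 3] = [&ALPHA, &BETA, &GAMMA];

// Yields references to the original statics instead of a contiguous `&[Entry]`.
pub fn entries() -> impl Iterator<Item = &'static Entry> {
    POINTERS.iter().copied()
}
```

Because each item is a reference to the original static, no second copy of the data is needed, at the cost of giving up slice indexing and contiguity.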

Alternatively, we could leverage the llvm.global_ctors symbol, but this also requires further support from both rustc and wasm-bindgen. More on that in the issue https://github.com/rustwasm/wasm-bindgen/issues/1216#issuecomment-533072447 .

adamspofford-dfinity commented 2 years ago

Has there been any update on the blockers for this? This would come in extremely handy.

AaronFriel commented 2 years ago

@dtolnay What's your opinion on the right direction to go here? It seems like there are two options from @Frizi's comment:

  1. Hack around the constraints on WebAssembly (more closely resembling a Harvard architecture) potentially requiring a non-contiguous "array" provided via Iterator<Item = &T>.

  2. Use llvm.global_ctors to put the initialization in the WebAssembly start function.

It seems like this may be a dichotomy, unless there's a third (or nth) option that you see available?

dtolnay commented 2 years ago

Neither of those is appropriate for this crate. inventory already exists, and uses global_ctors, and exposes Iterator<Item = &T>. If that's what someone needs, they should use that crate.
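For readers unfamiliar with the pattern, registering items into a global list and exposing them through an iterator can be sketched roughly as below. This is not inventory's actual code: `Node`, `register`, and `iter` are hypothetical names, and registration is called explicitly here, whereas a ctor-based design would run it from a constructor before `main`.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

// Hypothetical node type; names are illustrative, not taken from `inventory`.
pub struct Node {
    pub value: &'static str,
    next: AtomicPtr<Node>,
}

impl Node {
    pub const fn new(value: &'static str) -> Self {
        Node { value, next: AtomicPtr::new(ptr::null_mut()) }
    }
}

// Head of a global singly linked list of registered nodes.
static HEAD: AtomicPtr<Node> = AtomicPtr::new(ptr::null_mut());

// Pushes a node onto the front of the list. In a ctor-based design this call
// would run from a constructor before `main`; here the caller must invoke it
// explicitly, and must register each node at most once.
pub fn register(node: &'static Node) {
    let new = node as *const Node as *mut Node;
    let mut head = HEAD.load(Ordering::Acquire);
    loop {
        node.next.store(head, Ordering::Relaxed);
        match HEAD.compare_exchange_weak(head, new, Ordering::AcqRel, Ordering::Acquire) {
            Ok(_) => return,
            Err(current) => head = current,
        }
    }
}

// Walks the list, yielding references to the registered statics.
pub fn iter() -> impl Iterator<Item = &'static Node> {
    let mut cursor = HEAD.load(Ordering::Acquire);
    std::iter::from_fn(move || {
        // SAFETY: every non-null pointer in the list came from a `&'static Node`.
        let node: &'static Node = unsafe { cursor.as_ref()? };
        cursor = node.next.load(Ordering::Acquire);
        Some(node)
    })
}
```

Items come back in reverse registration order, and nothing is contiguous in memory, which is why this style of API hands out an iterator rather than a slice.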

AaronFriel commented 2 years ago

Thanks, sorry, I was a bit confused about the future of the inventory crate, given that there's an issue for typetag to migrate to it.

daboross commented 2 years ago

> Thanks, sorry, I was a bit confused about the future of the inventory crate, given that there's an issue for typetag to migrate to it.

From my understanding, there are two directions forward for typetag on WASM:

  1. add CTOR support for WASM to Rust (https://github.com/rust-lang/rust/issues/82371), then use this support in inventory's current model which uses CTORs
  2. add WASM support for custom sections containing pointers to Rust, then add WASM support to linkme (this issue) and switch typetag to linkme

I believe the typetag issue for switching to linkme - https://github.com/dtolnay/typetag/issues/15 - is predicated on the assumption that 2 is preferable to (and possibly easier than) 1.

I believe it is preferable if possible: it doesn't require encoding life-before-main into rustc proper, and is thus cleaner.

Unfortunately, I think the small print for 2. makes it potentially much harder to implement: based on Frizi's comment above (https://github.com/dtolnay/linkme/issues/6#issuecomment-533089347), LLVM doesn't support doing the work to store pointers in custom sections on WASM. That isn't something we can work around - as I understand it, we need our linker to resolve pointers in the #[link_section] arrays, or it doesn't work at all.


This option is interesting to me:

> Hack around the constraints on WebAssembly (more closely resembling a Harvard architecture) potentially requiring a non-contiguous "array" provided via Iterator<Item = &T>.

Can we currently write code which does this? Specifically, can we do this without CTORs? (as CTORs aren't available on WASM)

I think it would belong in inventory rather than linkme, but like, if we could do that, it would be pretty powerful. The other two options for "typetag on WASM" both require changes to rustc to become possible, so I'd automatically prefer this if we can write it on current rustc.

The blocker here in my understanding isn't where things belong - it's getting anything to gather pointers from around a program without explicitly being called. There's no support for CTORs on WASM with rustc, and no support for properly linking pointers into custom sections on WASM either.

dtolnay commented 1 year ago

I am closing this issue because it hasn't turned out to be actionable in this crate. This comment has a writeup of what would be the next step for a rust compiler+library or compiler+language RFC to do distributed slices in the compiler instead of linker.