bytecodealliance / wasm-tools

CLI and Rust libraries for low-level manipulation of WebAssembly modules
Apache License 2.0
1.32k stars 241 forks source link

`wit-component` should accept modules with shared memories #1674

Open whitequark opened 3 months ago

whitequark commented 3 months ago

(Originally filed against jco as https://github.com/bytecodealliance/jco/issues/470).

To reproduce:

$ cat test.c
#include <stdio.h>

int main() {
  puts("hello");
}
$ .../wasi-sdk-22.0/bin/clang -target wasm32-wasip1-threads -pthread test.c -o test.wasm
$ jco new test.wasm --wasi-command -o test.component.wasm
(jco componentNew) ComponentError: failed to encode a component from module
$failed to validate component output

Caused by:
    type mismatch for export `memory` of module instantiation argument `env`
    mismatch in the shared flag for memories (at offset 0x7fd8)
    at componentNew (file:///home/whitequark/.npm-packages/lib/node_modules/@bytecodealliance/jco/obj/wasm-tools.js:2462:11)
    at componentNew (file:///home/whitequark/.npm-packages/lib/node_modules/@bytecodealliance/jco/src/cmd/wasm-tools.js:58:18)
    at async file:///home/whitequark/.npm-packages/lib/node_modules/@bytecodealliance/jco/src/jco.js:134:9

This example is extracted from a larger project (YoWASP Clang). LLVM currently does not compile for the wasm32-wasip1 target (it requires wasm32-wasip1-threads) because it extensively uses atomics even in a single-threaded build. This is difficult to change. However, an LTO build of LLVM/Clang/LLD (with the WebAssembly patchset that's likely going to be accepted in near future) doesn't import thread.spawn and so doesn't need to be treated differently.

Should this restriction be relaxed?

alexcrichton commented 3 months ago

At this time there's generally no support for shared memories with components. Excluding the actual ability to spawn threads there's not really any reason for this other than elbow grease has not been applied. In other words if you're not spawning threads it's mostly just todos here and there to get the stars to align correctly and support shared memories. Given this I think it would be reasonable to support this, but it would require changes at the component model validation layer to accept shared memories as the target for lifts/lowers/etc.

Once you actually spawn threads that's an entirely separate can of worms that I'd defer to the shared-everything-threads proposal.

whitequark commented 3 months ago

Thanks. To be clear I'm not asking for support for spawning threads in jco and friends. The root cause here is that LLVM really wants std::mutex even in single threaded builds which with the current state of wasi-sdk means that you will always have a shared memory in the output. This is actually not exclusive to LLVM, nextpnr for example is similar, but LLVM has the property that the upstream is unwilling to add conditional compilation for those mutexes, where with nextpnr I have been able to just commit those.

alexcrichton commented 3 months ago

Oh for sure and makes sense yeah. IMO the "best answer" here is https://github.com/WebAssembly/wasi-libc/issues/501 where std::mutex should definitely exist for all targets, even those that can't spawn threads.

whitequark commented 3 months ago

I have found that the "mismatch in the shared flag for memories" comes from this line:

https://github.com/bytecodealliance/wasm-tools/blob/263b697a442273ef9848e73546c770f5b7f6e2a9/crates/wasmparser/src/validator/types.rs#L4263

How should a fix look like? Is altering the subtype check the right thing to do here? I'm not completely sure if a shared memory is a subtype of a non-shared memory; intuitively I feel like it probably is (all operations for a non-shared memory apply to the shared memory but not vice versa), seeing as the atomics are defined as:

All atomic memory accesses can be performed on both shared and unshared linear memories.

But I'm not completely sure and I'd rather defer to an expert in this area.

alexcrichton commented 3 months ago

Oh no definitely no subtyping here, that would have ramifications on the core wasm level and affect quite a few moving pieces. I don't disagree with you that it's theoretically possible but that should be done as a first-class modification to the wasm specification. For example these are the formal semantics of matching with the threads proposal and I believe the intention is that there's no subtyping. (that would also affect JS which I'm not sure wants to be affected).

Anyway other than that there's a few somewhat related issues here which is going to make fixing this not exactly trivial:

Overall this I don't think is as simple as a game of whack-a-mole for now. I think the first thing to do is to get alignment at the component model level that shared memories are ok to use here. From there then it's a bit more whack-a-mole but there's a lot of places to touch. I'd expect this to require a nontrivial investment of time to push a change like this through.

whitequark commented 3 months ago

I see. I suppose what I'll do then is to use walrus to flip the shared flag or something ^^;

Thanks for writing it out though! I'm sure this will eventually come in handy.

alexcrichton commented 3 months ago

Alas yeah unfortunately I don't think there's a great way out of this. Another option would be to implement --no-shared-memory in LLD or something like that but even that is a bit sketchy.