WebAssembly / wasi-sdk

WASI-enabled WebAssembly C/C++ toolchain
Apache License 2.0
1.28k stars 192 forks source link

Compiling libz-ng using wasi-sdk produces an invalid binary #492

Open bjorn3 opened 1 month ago

bjorn3 commented 1 month ago

The blogpost-uncompress example of zlib-rs includes the C libz-ng library. Without libz-ng, zlib-rs compiles fine for wasm.

Reproduction

$ git clone https://github.com/trifectatechfoundation/zlib-rs
$ cd zlib-rs
$ git apply <<EOF
diff --git a/test-libz-rs-sys/Cargo.toml b/test-libz-rs-sys/Cargo.toml
index 47d22c1..ee4ddf6 100644
--- a/test-libz-rs-sys/Cargo.toml
+++ b/test-libz-rs-sys/Cargo.toml
@@ -25,5 +25,5 @@ libz-sys.workspace = true
 quickcheck.workspace = true
 crc32fast = "1.3.2"

-libloading.workspace = true
-dynamic-libz-sys.workspace = true
+#libloading.workspace = true
+#dynamic-libz-sys.workspace = true
EOF
$ CC=/path/to/wasi-sdk-24.0-x86_64-linux/bin/clang cargo build -p test-libz-rs-sys --example blogpost-uncompress --release --target wasm32-wasip1
$ wasmtime compile target/wasm32-wasip1/release/examples/blogpost-uncompress.wasm
Error: WebAssembly translation error

Caused by:
    Invalid input WebAssembly code at offset 108104: type mismatch: values remaining on stack at end of block
$ node
> const wasmBuffer = fs.readFileSync('target/wasm32-wasip1/release/examples/blogpost-uncompress.wasm');
undefined
> let mod = await WebAssembly.compile(wasmBuffer)
Uncaught:
CompileError: WebAssembly.compile(): Compiling function #191:"adler32_stub" failed: expected 1 elements on the stack for fallthru, found 3 @+104593
fitzgen commented 1 month ago

Given that both Wasmtime and node consider this an invalid Wasm, I suspect that the toolchain really is producing an invalid Wasm, rather than that both validators have the same bug.

So I suspect this is ultimately a bug in either LLVM or wasm-ld.

alexcrichton commented 1 month ago

I found out a bit more with:

$ wasm-tools validate target/wasm32-wasip1/release/examples/blogpost-uncompress.wasm
error: /home/alex/.cargo/registry/src/index.crates.io-6f17d22bba15001f/libz-sys-1.1.20/src/zlib-ng/functable.c:300:5 function `adler32_stub` failed to validate

Caused by:
    0: func 191 failed to validate
    1: type mismatch: values remaining on stack at end of block (at offset 0x192ee)

which points to this location. Which ended up not being that illuminating.

Looking at the disassembly I see:

(;@191c2 ;)  (func $adler32_stub (;191;) (type 2) (param i32 i32 i32) (result i32)
(;@191c3 ;)    (local i32 i32 i32)
(;@191c5 ;)    global.get $__stack_pointer
(;@191cb ;)    i32.const 16
(;@191cd ;)    i32.sub
(;@191ce ;)    local.tee 3
(;@191d0 ;)    global.set $__stack_pointer
(;@191d6 ;)    global.get $GOT.data.internal.__memory_base
(;@191dc ;)    local.set 4
(;@191de ;)    local.get 3
(;@191e0 ;)    i32.const 15
(;@191e2 ;)    i32.add
(;@191e3 ;)    call $cpu_check_features
(;@191e9 ;)    local.get 4
(;@191eb ;)    i32.const 7453020
(;@191f1 ;)    i32.add
(;@191f2 ;)    local.tee 4
(;@191f4 ;)    global.get $GOT.func.internal.adler32_c
(;@191fa ;)    i32.store offset=4
(;@191fd ;)    local.get 4
(;@191ff ;)    global.get $GOT.data.internal.__table_base
(;@19205 ;)    local.tee 5
(;@19207 ;)    i32.const 40
(;@1920d ;)    i32.add
(;@1920e ;)    i32.store
(;@19211 ;)    local.get 4
;; ...
(;@192d5 ;)    local.get 1
(;@192d7 ;)    local.get 2
(;@192d9 ;)    call $_start
(;@192df ;)    local.set 4
(;@192e1 ;)    local.get 3
(;@192e3 ;)    i32.const 16
(;@192e5 ;)    i32.add
(;@192e6 ;)    global.set $__stack_pointer
(;@192ec ;)    local.get 4
             )

which also isn't too illuminating. I'm seeing GOT.* things though which looks like this is compiled with -fPIC which might be the cause of the issue. Dynamic linking, which I don't think is intended here, might need more careful setup to align things? (Rust isn't trying to do that by default and probably is missing a linker flag or something like that)

bjorn3 commented 1 month ago

Disabling -fPIC indeed fixes the issue. It seems that the cc crate by default passes it most targets, including wasm. Is there a reason mixing -fPIC and -fno-pic doesn't work? It works fine on native targets afaik.

alexcrichton commented 1 month ago

I believe it's intended to work, but that's at least good for helping to narrow down the issue! I'm not familiar enough with -fPIC myself though to know how best to debug this further and how to reduce. Ideally there'd be just a standalone *.c file or two with clang flags to reproduce, and that'd probably be best for getting this fixed in LLVM

sbc100 commented 1 month ago

yes, this sounds like some kind llvm/lld issue with linking -fPIC code into static binaries. It should indeed be supported (although with some binary side and runtime overheard and would require wasm-opt or similar to then remove).

Can you open an llvm bug for this? If possible with a simple reproducer attached.

bjorn3 commented 1 month ago

Pushed a reproducer which barely uses any rust (aside from to build zlib-ng) to https://github.com/bjorn3/wasi-sdk-issue-492 Turns out that opt-level=1 is necessary for the bug to show up.

alexcrichton commented 1 month ago

One reduction via creduce is

struct table {
  int (*f1)(char, int);
  int (*f2)(void);
} functable;

int g(char, int);

void h() {
  struct table t;
  t.f1 = g;
  __atomic_store(&functable.f1, &t.f1, 5);
  __atomic_store(&functable.f2, &t.f2, 5);
}
void i(char buf, int j) {
  h();
  functable.f1(buf, j);
}
$ clang -fPIC -O1 src.c -c -o foo.o -g
$ wasm-tools validate foo.o
error: /home/alex/code/wasi-sdk-issue-492/build/src.c:16:3 function `i` failed to validate

Caused by:
    0: func 2 failed to validate
    1: type mismatch: expected a type but nothing on stack (at offset 0xf1)

I'm not sure if that's the exact same issue as the one here, but it looks similar at least.

bjorn3 commented 1 month ago

Have you already opened an llvm issue or do you mind if I open one?

alexcrichton commented 1 month ago

I haven't yet, please feel free!

bjorn3 commented 1 month ago

Opened https://github.com/llvm/llvm-project/issues/111855

bjorn3 commented 1 month ago

The latest version of the cc crate has stopped passing -fPIC on wasm.