bytecodealliance / wasmtime

A fast and secure runtime for WebAssembly
https://wasmtime.dev/
Apache License 2.0
15.1k stars 1.26k forks source link

Wasmtime loops forever when `wasm32-wasi-threads` guests write to stdout #7745

Closed dicej closed 8 months ago

dicej commented 8 months ago

UPDATE: The original issue was due to not specifying -pthread when building. Now the problem is that Wasmtime loops indefinitely when a wasm32-wasi-threads guest writes to stdout. See https://github.com/bytecodealliance/wasmtime/issues/7745#issuecomment-1877546864 for steps to reproduce.

Original report follows:

Per https://github.com/WebAssembly/wasi-libc/issues/456#issuecomment-1866412804, a number of wasi-sdk and wasi-libc tests are currently failing when run using Wasmtime but run successfully on other runtimes. They seem to be related to thread-local storage (e.g. tests involving errno, which is a thread-local variable).

Test Case

This file was built from https://github.com/WebAssembly/wasi-sdk/blob/main/tests/general/signals.c for wasm32-wasi-threads using wasi-sdk:

signals.c.-O0.wasm.tar.gz

Steps to Reproduce

First, install any recent version of Wasmtime (e.g. v16.0.0). Then:

tar xf signals.c.-O0.wasm.tar.gz
wasmtime run signals.c.-O0.wasm

Expected Results

The expected output is:

psignal message for SIGINT: Interrupt

Actual Results

The actual output is:

Assertion failed: raise(_NSIG) == -1 && errno == EINVAL (: main: 31)
Error: failed to run main module `signals.c.-O0.wasm`

Caused by:
    0: failed to invoke command default
    1: error while executing at wasm backtrace:
           0: 0x1072 - <unknown>!abort
           1: 0x1179 - <unknown>!__assert_fail
           2:  0x716 - <unknown>!__original_main
           3:  0x1da - <unknown>!_start
    2: wasm trap: wasm `unreachable` instruction executed

Versions and Environment

Wasmtime version or commit: v16.0.0

Operating system: MacOS (but should repro on any OS)

Architecture: ARM64 (but should repro on any architecture)

Extra Info

See also https://github.com/WebAssembly/wasi-threads/pull/49, which may be related.

alexcrichton commented 8 months ago

What are the other runtimes this passes in? If I run this in node it produces the same error as Wasmtime using an adapted script from this issue

runner.js ``` const ITERS = 1; //////////////////////////////////////////////////////////////////////////////// class TextEncoder { constructor(enc) { if (enc != "utf-8") { throw new Error("FITZGEN: unsupported encoding: " + enc); } } encode(s, n) { let buf = new Uint8Array(s.length); for (let i = 0; i < Math.min(s.length, n); i++) { buf[i] = s.charCodeAt(i); // lol } return buf; } }; class TextDecoder { constructor(enc) { if (enc != "utf-8") { throw new Error("FITZGEN: unsupported encoding: " + enc); } } decode(buf) { let buf8 = new Uint8Array(buf); let s = ""; for (let i = 0; i < buf8.length; i++) { s += String.fromCharCode(buf8[i]); // lol } return s; } }; //////////////////////////////////////////////////////////////////////////////// async function ionCompile(wasm) { let module = await WebAssembly.compile(wasm); // while (!wasmHasTier2CompilationCompleted(module)) { // sleep(1); // } // wasmDis(module); return module; } function unimplemented(name) { return function (...args) { throw new Error(name + " is unimplemented! args = " + args); } } class ProcExitError extends Error { constructor(code) { this.code = code; this.message = "Program exited with code " + code; } toString() { return "ProcExitError: " + this.message } } function trace(obj) { let proxy = {}; for (let key of Object.keys(obj)) { if (typeof obj[key] != "function") { proxy[key] = obj[key]; continue; } proxy[key] = function (...args) { console.log("TRACE: " + key + "(" + args + ")"); let ret = obj[key](...args); console.log("TRACE: -> " + ret); return ret; }; } return proxy; } //////////////////////////////////////////////////////////////////////////////// const fs = require('fs'); async function main() { let smWasm = fs.readFileSync(process.argv[2]); let smModule = await ionCompile(smWasm); for (let i = 0; i < ITERS; i++) { let args = process.argv; let mem = null; let start = null; let unread = true; let smInstance = await WebAssembly.instantiate(smModule, trace({ env: { memory: new WebAssembly.Memory({ // shared: true, initial: 8000, maximum: 16000, }), }, bench: { start: function () { start = monotonicNow() }, end: function () { let end = monotonicNow(); console.log("ITER: " + (end - start)); } }, wasi: { 'thread-spawn': function() { throw new Error('thread-spawn'); }, }, wasi_snapshot_preview1: { fd_pread: function() { throw new Error('fd_pread'); }, poll_oneoff: function(in_, out, n) { console.log(n); // throw new Error('poll_oneoff'); return 0; }, random_get: function(a, b) { while (b > 0) { (new DataView(mem.buffer)).setInt8(a, 1, true); b -= 1; a += 1; } return 0; }, args_get: function(a, b) { for (let arg of args) { (new DataView(mem.buffer)).setInt32(a, b, true); a += 4; for (let c in arg) { (new DataView(mem.buffer)).setInt8(b, arg.charCodeAt(c), true); b += 1; } (new DataView(mem.buffer)).setInt8(b, 0, true); b += 1; } return 0; }, args_sizes_get: function(a, b) { let len = 0; for (let arg of args) { len += arg.length + 1; } (new DataView(mem.buffer)).setInt32(a, args.length, true); (new DataView(mem.buffer)).setInt32(b, len, true); return 0; }, clock_res_get: function () { return 1; }, clock_time_get: function(a, b, c) { const now = Math.round(performance.now() * 1000000); (new DataView(mem.buffer)).setBigInt64(c, BigInt(now), true); return 0; }, fd_filestat_get: function() { throw new Error('fd_filestat_get'); }, fd_read: function(fd, iovecs_ptr, iovecs_len, out_ptr) { let mem8 = new Uint8Array(mem.buffer); switch (fd) { case 4: let data = os.file.readFile("/home/nick/sightglass/benchmarks/spidermonkey/default.input.md"); let k = 0; for (let i = 0; i < iovecs_len && k < data.length && unread; i++) { let ptr = (new DataView(mem.buffer)).getUint32(iovecs_ptr + i * 8, true); let len = (new DataView(mem.buffer)).getUint32(iovecs_ptr + i * 8 + 4, true); for (let j = 0; j < len && k < data.length; j++) { mem8[ptr + j] = data.charCodeAt(k++); } } unread = false; (new DataView(mem.buffer)).setUint32(out_ptr, k, true); return 0; default: return 8; } }, fd_seek: function(fd, offset, whence, out_ptr) { switch (fd) { case 4: let len = os.file.readFile("/home/nick/sightglass/benchmarks/spidermonkey/default.input.md").length; (new DataView(mem.buffer)).setBigUint64(out_ptr, BigInt(len), true); return 0; default: return 8; } }, fd_write: function(a, b, c, d) { let s = ''; let total = 0; while (c > 0) { let base = (new DataView(mem.buffer)).getInt32(b, true); let len = (new DataView(mem.buffer)).getInt32(b + 4, true); b += 8; c -= 1; while (len > 0) { let c = new Uint8Array(mem.buffer)[base] s += String.fromCharCode(c); len -= 1; base += 1; total += 1; } } //console.log("fd_write(" + a + "): " + s.trimEnd()); console.log(s.trimEnd()); (new DataView(mem.buffer)).setInt32(d, total, true); return 0; }, fd_fdstat_set_flags: function() { throw new Error('fd_fdstat_set_flags'); }, path_filestat_get: function() { throw new Error('path_filestat_get'); }, path_open: function(fd, dirflags, path_ptr, path_len, oflags, fs_rights_base, fs_rights_inheriting, fdflags, out_ptr) { let buf = new Uint8Array(path_len); let mem8 = new Uint8Array(mem.buffer); for (let i = 0; i < path_len; i++) { buf[i] = mem8[path_ptr + i]; } let path = (new TextDecoder('utf-8')).decode(buf); switch (path) { case "default.input.md": (new DataView(mem.buffer)).setInt32(out_ptr, 4, true); return 0; default: console.log('denying path_open(' + path + ')'); return 2; } }, path_remove_directory: function() { throw new Error('path_remove_directory'); }, path_unlink_file: function() { throw new Error('path_unlink_file'); }, sched_yield: function() { throw new Error('sched_yield'); }, environ_get: function() { return 0; }, environ_sizes_get: function(a, b) { (new DataView(mem.buffer)).setInt32(a, 0, true); (new DataView(mem.buffer)).setInt32(b, 0, true); return 0; }, fd_close: function(fd) { return 0; }, fd_fdstat_get: function(fd, out_ptr) { (new DataView(mem.buffer)).setInt32(out_ptr, 0); // type switch (fd) { case 3: (new DataView(mem.buffer)).setInt32(out_ptr + 4, 3, true); break; case 4: (new DataView(mem.buffer)).setInt32(out_ptr + 4, 4, true); break; default: (new DataView(mem.buffer)).setInt32(out_ptr + 4, 0, true); break; } (new DataView(mem.buffer)).setInt32(out_ptr + 8, 0, true); (new DataView(mem.buffer)).setInt32(out_ptr + 12, 0, true); (new DataView(mem.buffer)).setInt32(out_ptr + 16, 0, true); (new DataView(mem.buffer)).setInt32(out_ptr + 20, 0, true); return 0; }, fd_prestat_get: function(a, b) { // case for preopened "." if (a == 3) { // discriminant for directory (new DataView(mem.buffer)).setInt32(b, 0, true); // one entry in directory (new DataView(mem.buffer)).setInt32(b + 4, 1, true); return 0; } return 8; }, fd_prestat_dir_name: function(fd, ptr, len) { let buf = (new TextEncoder('utf-8')).encode(".", 1); let mem8 = new Uint8Array(mem.buffer); for (let i = 0; i < Math.min(buf.length, len); i++) { mem8[ptr + i] = buf[i]; } mem8[Math.min(buf.length, len)] = 0; return 0; }, proc_exit: function(code) { throw new ProcExitError(code); }, }, })); mem = smInstance.exports.memory; try { smInstance.exports._start(); } catch (e) { if ((e instanceof ProcExitError) && e.code == 0) { continue; } throw e; } } } main().catch(function (e) { console.log("====== ERROR!!! ======"); console.log(e); console.log(e.stack); }); ```

If the only other runtime is WAMR (from the issue thread you linked then WAMR has been known in the past for non-spec-compliant behavior. If there are other runtimes though then the mystery will deepen.

dicej commented 8 months ago

Okay, I dug a bit deeper into this. Turns out all the wasi-libc tests are passing with Wasmtime 16, Toywasm 35, and Wasmer 4.2.5 as long as I build with -pthread -- if I don't build with that flag, they all fail. Lesson learned.

Meanwhile, I've also updated the wasi-sdk test script to build with -pthread. Now Toywasm and Wasmer are able to run the tests okay (modulo the fact that the .expected files are hard-coded to match Wasmtime, and the other runtimes format error output differently), but Wasmtime ends up busy looping forever as soon as any test tried to write to stdout, e.g.:

* thread #1, name = 'main', queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x000000018c9a6060 libsystem_kernel.dylib`kevent + 8
    frame #1: 0x0000000100a5d284 wasmtime`mio::waker::Waker::wake::h70f0de287550c690 + 64
    frame #2: 0x0000000100a4cb94 wasmtime`tokio::util::wake::wake_arc_raw::h1a32ae31a8ec5072 + 68
    frame #3: 0x0000000100a4c0f4 wasmtime`tokio::runtime::scheduler::current_thread::Context::park_yield::hc41080da61d04b53 + 320
    frame #4: 0x00000001003da4a8 wasmtime`tokio::runtime::context::scoped::Scoped$LT$T$GT$::set::hf092eaeb69914454 + 820
    frame #5: 0x00000001005a69b0 wasmtime`tokio::runtime::context::set_scheduler::h48cc529b1f87605e + 172
    frame #6: 0x00000001003dbb50 wasmtime`tokio::runtime::scheduler::current_thread::CoreGuard::block_on::h0aea4e87e0591d94 + 112
    frame #7: 0x00000001005b33bc wasmtime`tokio::runtime::context::runtime::enter_runtime::h2aa886ffe04dfc58 + 668
    frame #8: 0x00000001003ad284 wasmtime`wasmtime_wasi::preview2::in_tokio::hb195ebe39d79dde3 + 420
    frame #9: 0x00000001004a7bf0 wasmtime`_$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h04ccc3778290beca + 180
    frame #10: 0x00000001003ef1f0 wasmtime`wasmtime_runtime::instance::Instance::from_vmctx::hac10297a5cfbc8d9 + 148
    frame #11: 0x00000001004209a0 wasmtime`_$LT$F$u20$as$u20$wasmtime..func..IntoFunc$LT$T$C$$LP$wasmtime..func..Caller$LT$T$GT$$C$A1$C$A2$C$A3$C$A4$RP$$C$R$GT$$GT$::into_func::native_call_shim::h383012752bf42248 + 84
    frame #12: 0x00000001025c954c
    frame #13: 0x00000001025c7030
    frame #14: 0x00000001025c8650
    frame #15: 0x00000001025c882c
    frame #16: 0x00000001025c8a24
    frame #17: 0x00000001025c8300
    frame #18: 0x00000001025c8440
    frame #19: 0x00000001025c436c
    frame #20: 0x00000001025c6df0
    frame #21: 0x00000001025c42a0
    frame #22: 0x00000001025c91e8
    frame #23: 0x0000000100ede5e0 wasmtime`wasmtime_setjmp_16_0_0 + 100
    frame #24: 0x000000010065e638 wasmtime`wasmtime_runtime::traphandlers::_$LT$impl$u20$wasmtime_runtime..traphandlers..call_thread_state..CallThreadState$GT$::with::ha4d3eccbd3d85625 + 64
    frame #25: 0x00000001004f7530 wasmtime`wasmtime_runtime::traphandlers::catch_traps::hd22f75804c9e7d97 + 152
    frame #26: 0x000000010044c0a8 wasmtime`wasmtime::func::invoke_wasm_and_catch_traps::h92e67288af28d454 + 184
    frame #27: 0x000000010044e2ac wasmtime`wasmtime::func::Func::call_impl::h172a5645f5d101de + 1304
    frame #28: 0x0000000100495d10 wasmtime`wasmtime_cli::commands::run::RunCommand::load_main_module::h5baeac87cd95a968 + 2596
    frame #29: 0x0000000100494190 wasmtime`wasmtime_cli::commands::run::RunCommand::execute::hba50b58e459b0ce7 + 3056
    frame #30: 0x0000000100150474 wasmtime`wasmtime::Wasmtime::execute::h2578284636f5c3af + 296
    frame #31: 0x000000010014df80 wasmtime`wasmtime::old_cli::main::h4ab441c89575015e + 580
    frame #32: 0x0000000100155530 wasmtime`std::sys_common::backtrace::__rust_begin_short_backtrace::h90399811d0439507 + 12
    frame #33: 0x0000000100153ea0 wasmtime`std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h5450f370f5c6d820 + 24
    frame #34: 0x000000010107f17c wasmtime`std::rt::lang_start_internal::haeade8bb0e866fa6 [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::hdb9d57f6813b6639 at function.rs:284:13 [opt]
    frame #35: 0x000000010107f174 wasmtime`std::rt::lang_start_internal::haeade8bb0e866fa6 [inlined] std::panicking::try::do_call::h3316be3da4e712e4 at panicking.rs:504:40 [opt]
    frame #36: 0x000000010107f174 wasmtime`std::rt::lang_start_internal::haeade8bb0e866fa6 [inlined] std::panicking::try::h72745351e21d17fd at panicking.rs:468:19 [opt]
    frame #37: 0x000000010107f174 wasmtime`std::rt::lang_start_internal::haeade8bb0e866fa6 [inlined] std::panic::catch_unwind::h2df3aeb4d14c657e at panic.rs:142:14 [opt]
    frame #38: 0x000000010107f174 wasmtime`std::rt::lang_start_internal::haeade8bb0e866fa6 [inlined] std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::h40cc69f0b809216f at rt.rs:148:48 [opt]
    frame #39: 0x000000010107f174 wasmtime`std::rt::lang_start_internal::haeade8bb0e866fa6 [inlined] std::panicking::try::do_call::hdf604766e4bed556 at panicking.rs:504:40 [opt]
    frame #40: 0x000000010107f170 wasmtime`std::rt::lang_start_internal::haeade8bb0e866fa6 [inlined] std::panicking::try::h52de1aaf80258115 at panicking.rs:468:19 [opt]
    frame #41: 0x000000010107f170 wasmtime`std::rt::lang_start_internal::haeade8bb0e866fa6 [inlined] std::panic::catch_unwind::hec154f66f8285e2b at panic.rs:142:14 [opt]
    frame #42: 0x000000010107f170 wasmtime`std::rt::lang_start_internal::haeade8bb0e866fa6 at rt.rs:148:20 [opt]
    frame #43: 0x0000000100151898 wasmtime`main + 52
    frame #44: 0x000000018c687f28 dyld`start + 2236

The stack trace is a bit different each time I take a sample, but it's generally spinning in tokio or mio.

abrown commented 8 months ago

Maybe a wait is not being notify-ed? I added #7220 to help diagnose this kind of thing: does running with WASMTIME_LOG=wasmtime_runtime=trace help clarify things?

dicej commented 8 months ago
 $ WASMTIME_LOG=wasmtime_runtime=trace  wasmtime argc_argv_main.c.-O0.wasm
2024-01-04T18:06:50.745040Z TRACE wasmtime_runtime::memory: memory.atomic.notify(addr=0xb58, count=4294967295)

Then it hangs with no further output.

For reference:

// argc_argv_main.c
#include <stdio.h>

int main(int argc, char *argv[]) {
    puts("hello from argc argv main!");
    return 0;
}
/opt/wasi-sdk/bin/clang --sysroot=/opt/wasi-sdk/share/wasi-sysroot --target=wasm32-wasi-threads -pthread -O0 argc_argv_main.c -o argc_argv_main.c.-O0.wasm