nginx / unit

NGINX Unit - universal web app server - a lightweight and versatile open source server that simplifies the application stack by natively executing application code across eight different programming language runtimes.
https://unit.nginx.org
Apache License 2.0
5.25k stars 322 forks source link

Application restart breaks WebAssembly Components #1179

Closed lcrilly closed 3 months ago

lcrilly commented 3 months ago

According to the documentation I should be able to restart any application by making a GET request to the appropriate /control/applications/… API endpoint.

Unit handles the rollover gracefully, allowing the old processes to deal with existing requests and starting a new set of processes (as defined by the processes option) to accept new requests.

This is not working for WebAssembly Components. The restart request is accepted but subsequent calls to the application timeout.

Here is my reproducer based on this simple hello world component.

$ unitd --no-daemon --log /dev/stderr &
[1] 10450
$ 2024/03/07 09:48:42 [warn] 10450#5044676 Unit is running unprivileged, then it cannot use arbitrary user and group.
2024/03/07 09:48:42 [info] 10450#5044676 unit 1.32.0 started
2024/03/07 09:48:42 [info] 10451#5044678 discovery started
2024/03/07 09:48:42 [notice] 10451#5044678 module: wasm 0.1 "/opt/homebrew/lib/unit/modules/wasm.unit.so"
2024/03/07 09:48:42 [notice] 10451#5044678 module: wasm-wasi-component 0.1 "/opt/homebrew/lib/unit/modules/wasm_wasi_component.unit.so"
2024/03/07 09:48:42 [info] 10450#5044676 controller started
2024/03/07 09:48:42 [notice] 10450#5044676 process 10451 exited with code 0
2024/03/07 09:48:42 [info] 10453#5044681 router started
2024/03/07 09:48:42 [info] 10453#5044681 OpenSSL 3.2.1 30 Jan 2024, 30200010

$ tee /dev/stderr < minimal.json | unitc /config
{
    "listeners": {
        "*:9000": {
            "pass": "applications/hello"
        }
    },
    "applications": {
        "hello": {
            "type": "wasm-wasi-component",
            "component": "/home/lcrilly/wasm/hello_wasi_http.wasm"
        }
    },
    "settings": {
        "http": {
            "log_route": true
        }
    }
}
2024/03/07 09:49:02 [info] 10508#5044950 "hello" prototype started
2024/03/07 09:49:02 [info] 10509#5044951 "hello" application started
{
  "success": "Reconfiguration done."
}
$ curl localhost:9000
2024/03/07 09:49:10 [notice] 10453#5044687 *15 http request line "GET / HTTP/1.1"
Hello, wasi:http/proxy world!
$ unitc /control/applications/hello/restart
2024/03/07 09:49:26 [alert] 10450#5044676 sendmsg(8, -1, -1, 1) failed (54: Connection reset by peer)
2024/03/07 09:49:26 [alert] 10509#5044951 recvmsg(6, 2) failed (9: Bad file descriptor)
{
  "success": "Ok"
}
$ curl localhost:9000
2024/03/07 09:49:33 [notice] 10453#5044689 *25 http request line "GET / HTTP/1.1"
^C
lcrilly commented 3 months ago

Update.

This may or may not be related but after configuring a wasm-wasi-component application and then replacing the configuration with a different (single) application, the original Wasm app processes (prototype and app) persist (when they should not).

ac000 commented 3 months ago

Yes, sounds like something funky going on, looking at it...

ac000 commented 3 months ago

The processes weren't exit(2)ing..

Try this

diff --git a/src/wasm-wasi-component/src/lib.rs b/src/wasm-wasi-component/src/lib.rs
index 3ee40c4f..47c94bdc 100644
--- a/src/wasm-wasi-component/src/lib.rs
+++ b/src/wasm-wasi-component/src/lib.rs
@@ -4,6 +4,7 @@ use http_body_util::combinators::BoxBody;
 use http_body_util::{BodyExt, Full};
 use std::ffi::{CStr, CString};
 use std::mem::MaybeUninit;
+use std::process;
 use std::ptr;
 use std::sync::OnceLock;
 use tokio::sync::mpsc;
@@ -126,7 +127,7 @@ unsafe extern "C" fn start(
         bindings::nxt_unit_run(unit_ctx);
         bindings::nxt_unit_done(unit_ctx);

-        Ok(())
+        process::exit(0x0);
     })
 }
lcrilly commented 3 months ago

Thanks for looking!

I've done a cursory test with this patch and at first glance it works as expected. I can cargo component build && unitc /control/applications/…/restart with impunity.

Side note: it would be more convenient if libwasm_wasi_component.so was compiled with a _unit.so suffix so that it can be disovered without first renaming (or requiring make install to do that).

danielledeleo commented 3 months ago

I've done a cursory test with this patch and at first glance it works as expected. I can cargo component build && unitc /control/applications/…/restart with impunity.

I also replicated that the fix works. 👍

Side note: it would be more convenient if libwasm_wasi_component.so was compiled with a _unit.so suffix so that it can be disovered without first renaming (or requiring make install to do that).

I believe I ran into this too. Make doesn't appear pick up on changes to lib.rs.

ac000 commented 3 months ago

I also replicated that the fix works. 👍

Thanks!, I'll add both your Tested-by:'s

Side note: it would be more convenient if libwasm_wasi_component.so was compiled with a _unit.so suffix so that it can be disovered without first renaming (or requiring make install to do that).

You have to really fight against cargo to get it named differently, we do that for macOS... I suppose we could do it for the others, but I'm not sure if there is any other practical differences between cargo build ... and cargo rust ...

NXT_OS=$(uname -s)

if [ $NXT_OS = "Darwin" ]; then
        NXT_CARGO_CMD="cargo rustc --release --manifest-path src/wasm-wasi-component/Cargo.toml -- --emit link=target/release/libwasm_wasi_component.so -C link-args='-undefined dynamic_lookup'"
else
        NXT_CARGO_CMD="cargo build --release --manifest-path src/wasm-wasi-component/Cargo.toml"
fi

I believe I ran into this too. Make doesn't appear pick up on changes to lib.rs.

Heh, yes, a slightly different issue, but I also have that fixed. PR coming shortly...

ac000 commented 3 months ago

If either of you want to test the latest version of the patch, particularly if you have arm64 hardware... I'm really interested in if the language module builds without error.

ac000 commented 3 months ago

OK, I found an arm64 machine to test on and I suspected, it fails to build there...