nginx / unit

NGINX Unit - universal web app server - a lightweight and versatile open source server that simplifies the application stack by natively executing application code across eight different programming language runtimes.
https://unit.nginx.org
Apache License 2.0
5.25k stars 322 forks source link

Some wasm-wasi-component related fixes #1181

Closed ac000 closed 3 months ago

ac000 commented 3 months ago

This PR comprises two fixes

Missing dependencies in the Makefile

Terminating applications must call exit(2)

callahad commented 3 months ago

I'm still wrapping my head around the bindings (and nxt_application.c), but I'm not fully certain of the approach to exiting.

For one, it seems like we should call exit from outside of the closure passed to handle_result. I think we might short-circuit handle_result's error handling otherwise.

Second, if we're directly exiting, the start function signature should change to return nothing instead of an nxt_int_t... but I also haven't fully convinced myself that calling exit directly from our application code is the right thing here.

ac000 commented 3 months ago

Rebase with master

$ git range-diff 3c2ae2dc...3eddef82
 -:  -------- >  1:  63bc3882 .mailmap: Map Dylan's 2nd GitHub address
 -:  -------- >  2:  f6899af6 Var: Fix cacheable issue for njs variable access
 -:  -------- >  3:  5511593d Remove support for Microsoft's Visual C++ compiler
 -:  -------- >  4:  0c2d7786 Remove support for Intel's icc compiler
 -:  -------- >  5:  e79e4635 Remove support for IBM's XL C compiler
 -:  -------- >  6:  9cd11133 Remove support for Sun's Sun Studio/SunPro C compiler
 -:  -------- >  7:  806e209d Remove -W from compiler flags
 -:  -------- >  8:  1dcb5383 Expand the comment about -Wstrict-overflow on GCC
 -:  -------- >  9:  0b5223e1 Disable strict-aliasing in clang by default
 -:  -------- > 10:  c1e3f02f Compile with -fno-strict-overflow
 -:  -------- > 11:  280a978d Add initial infrastructure for pretty printing make output
 -:  -------- > 12:  5d831af0 Hook up make pretty printing to the Unit core and tests
 -:  -------- > 13:  da335bec Pretty print the Java language module compiler output
 -:  -------- > 14:  574528f7 Pretty print the Perl language module compiler output
 -:  -------- > 15:  0a0dcf91 Pretty print the PHP language module compiler output
 -:  -------- > 16:  caaa1d28 Pretty print the Python language module compiler output
 -:  -------- > 17:  133f75fd Pretty print the Ruby language module compiler output
 -:  -------- > 18:  b763ba7e Pretty print the wasm language module compiler output
 -:  -------- > 19:  15072fbd Enable optional 'debuggable' builds
 -:  -------- > 20:  d23812b8 Allow to disable -Werror at 'make' time
 -:  -------- > 21:  f55fa70c Add a help target to the root Makefile
 -:  -------- > 22:  a171b399 Add an EXTRA_CFLAGS make variable
 -:  -------- > 23:  6b138571 Wasm-wc: Bump the mio crate from 0.8.10 to 0.8.11
 -:  -------- > 24:  2e615250 Add dependabot.yml
 -:  -------- > 25:  0d99744d Router: match when pattern and tested string are both zero length
 -:  -------- > 26:  fdc46759 NJS: avoiding arithmetic ops with NULL pointer in r->args
 -:  -------- > 27:  8844d33c Fixed undefined behaviour in left shift of int value
 -:  -------- > 28:  7dcd6c0e Avoiding arithmetic ops with NULL pointer in nxt_http_arguments_parse
 -:  -------- > 29:  264b3755 Avoiding arithmetic ops with NULL pointer in nxt_port_mmap_get
 -:  -------- > 30:  c9461a6b Initialize port_impl only when it is needed
 -:  -------- > 31:  dd701fb4 Avoiding arithmetic ops with NULL pointer in nxt_unit_mmap_get
 -:  -------- > 32:  abcfc4cd Fix the security-alert email link in the README
 1:  d21d4824 = 33:  4c033098 Rebuild wasm-wasi-component when any of its dependencies change
 2:  3c2ae2dc = 34:  3eddef82 Wasm-wc: Fix application restarts
ac000 commented 3 months ago

For one, it seems like we should call exit from outside of the closure passed to handle_result. I think we might short-circuit handle_result's error handling otherwise.

I've moved the exit(2) to outside the closure (assuming I know what that is...)

Second, if we're directly exiting, the start function signature should change to return nothing instead of an nxt_int_t... but I also haven't

That function has a signature of

typedef nxt_int_t (*nxt_process_start_t)(nxt_task_t *task,                      
    nxt_process_data_t *data);

We can signal errors in this function (i.e return NXT_ERROR) before we call

nxt_unit_run(unit_ctx);

which is the main event loop.

So it's right and I guess it was generated from bindgen or something.

fully convinced myself that calling exit directly from our application code is the right thing here.

That's how the module API works.

In C we basically do the following

    ...

    nxt_unit_run(unit_ctx);                                                     
    nxt_unit_done(unit_ctx);

    /* do any clean up here */

    exit(EXIT_SUCCESS);

We sit in nxt_unit_run() until we're told otherwise, e.g if we're told to exit, we come out of that function, call nxt_unit_done(), do any clean up we want then exit(2).

If we don't exit(2) Unit doesn't know we're properly terminated and we leave non-functional application processes around which gummy things up.

ac000 commented 3 months ago

Moved the exit(2) outside the closure

$ git range-diff 3eddef82...cbccf50f
1:  3eddef82 ! 1:  cbccf50f Wasm-wc: Fix application restarts
    @@ src/wasm-wasi-component/src/lib.rs: unsafe extern "C" fn start(
              let config = GLOBAL_CONFIG.get().unwrap();
              let state = GlobalState::new(&config)
     @@ src/wasm-wasi-component/src/lib.rs: unsafe extern "C" fn start(
    -         bindings::nxt_unit_run(unit_ctx);
              bindings::nxt_unit_done(unit_ctx);

    --        Ok(())
    -+        exit(EXIT_SUCCESS);
    -     })
    +         Ok(())
    +-    })
    ++    });
    ++
    ++    exit(EXIT_SUCCESS);
      }

    + unsafe fn handle_result(
callahad commented 3 months ago

It might be easier to talk through this live -- let me know if you want to hop on Zoom -- but I'll take a crack at this here.

Calling exit immediately terminates the process, skipping any destructors on the stack. If we're fine with an unclean shutdown here (I think we are but want to verify), then this approach is all good.

One consequence of exit is that the function never returns. Which means our signature of fn start(...) -> bindings::nxt_int_t is incorrect and should not specify a return type.

Lastly, instead of unconditionally exiting with EXIT_SUCCESS, I think we should exit with the result of handle_result, which will either be NXT_OK or NXT_ERROR as an nxt_int_t.

ac000 commented 3 months ago

Calling exit immediately terminates the process, skipping any destructors on the stack. If we're fine with an unclean shutdown here (I think we are but want to verify), then this approach is all good.

Well I don't know about this specific case, in the wasm module my start function is this

static nxt_int_t                                                                
nxt_wasm_start(nxt_task_t *task, nxt_process_data_t *data)                      
{                                                                               
    nxt_int_t              ret;                                                 
    nxt_unit_ctx_t         *unit_ctx;                                           
    nxt_unit_init_t        wasm_init;                                           
    nxt_common_app_conf_t  *conf;                                               

    conf = data->app;                                                           

    ret = nxt_unit_default_init(task, &wasm_init, conf);                        
    if (nxt_slow_path(ret != NXT_OK)) {                                         
        nxt_alert(task, "nxt_unit_default_init() failed");                      
        return ret;                                                             
    }                                                                           

    wasm_init.callbacks.request_handler = nxt_wasm_request_handler;             

    unit_ctx = nxt_unit_init(&wasm_init);                                       
    if (nxt_slow_path(unit_ctx == NULL)) {                                      
        return NXT_ERROR;                                                       
    }                                                                           

    NXT_WASM_DO_HOOK(NXT_WASM_FH_MODULE_INIT);                                  
    nxt_unit_run(unit_ctx);                                                     
    nxt_unit_done(unit_ctx);                                                    
    NXT_WASM_DO_HOOK(NXT_WASM_FH_MODULE_END);                                   

    if (nxt_wasm_ctx.dirs != NULL) {                                            
        char  **p;                                                              

        for (p = nxt_wasm_ctx.dirs; *p != NULL; p++) {                          
            nxt_free(*p);                                                       
        }                                                                       
        nxt_free(nxt_wasm_ctx.dirs);                                            
    }                                                                           

    nxt_wops->destroy(&nxt_wasm_ctx);                                           

    exit(EXIT_SUCCESS);                                                         
}

The only actually required bit there after the nxt_unit_done(unit_ctx); is the NXT_WASM_DO_HOOK(NXT_WASM_FH_MODULE_END); the rest is just being nice, freeing memory, which will be freed at exit(3) anyway...

Maybe the rust code could do something similar or maybe it does its own clean up or the whole thing is moot because we're exit(3)ing...

One consequence of exit is that the function never returns. Which means our signature of fn start(...) -> bindings::nxt_int_t is incorrect and should not specify return type.

As I've mentioned, it isn't incorrect. That is the function signature in the module API and as you can see in the function above we can return errors from it.

Lastly, instead of unconditionally exiting with EXIT_SUCCESS, I think we should exit with the result of handle_result, which will either be NXT_OK or NXT_ERROR as an nxt_int_t

I'm not sure about that, if anything you might use the result from nxt_unit_run(), some modules do, some don't.

Application process termination is handled by nxt_proto_sigchld_handler() (in src/nxt_application.c)

From that I don't see that the exit(3) status actually makes much difference. The only place we use WEXITSTATUS(status)) is in log messages...

For example here's how the PHP language module's start function ends

static nxt_int_t                                                                
nxt_php_start(nxt_task_t *task, nxt_process_data_t *data)
{
...
    nxt_unit_run(nxt_php_unit_ctx);                                             
    nxt_unit_done(nxt_php_unit_ctx);                                            

    exit(0);                                                                    

    return NXT_OK;
}

Yes, pointless return...

ac000 commented 3 months ago

Use the return value of nxt_unit_run() as the exit(3) status

$ git range-diff --creation-factor=62 cbccf50f...510925b7 # edited
1:  cbccf50f ! 1:  510925b7 Wasm-wc: Fix application restarts
    @@ Commit message
         The application would become unresponsive.

         What was happening was the old application process(es) weren't
    -    exit(2)ing and so while we were starting new application processes, the
    +    exit(3)ing and so while we were starting new application processes, the
         old ones were still hanging around in a non-functioning state.

    -    When we are terminating an application it must call exit(2).
    +    When we are terminating an application it must call exit(3).

    -    So that's what we do.
    -
    -    We want to use the standard EXIT_SUCCESS mnemonic, which in rust is
    -    available in external crates such as libc, however rather than bringing
    -    in a whole nother crate just for this, we just define it locally
    -    ourselves.
    +    So that's what we do. We use the return value of nxt_unit_run() as the
    +    exit status.

         Reported-by: Liam Crilly <liam@nginx.com>
         Fixes: 20ada4b5c ("Wasm-wc: Core of initial Wasm component model language module support")
    @@ Commit message
         Signed-off-by: Andrew Clayton <a.clayton@nginx.com>

      ## src/wasm-wasi-component/src/lib.rs ##
    +@@ src/wasm-wasi-component/src/lib.rs: use http_body_util::combinators::BoxBody;
      use http_body_util::{BodyExt, Full};
    -+use std::ffi::{c_int, CStr, CString};
    + use std::ffi::{CStr, CString};
      use std::mem::MaybeUninit;
      use std::ptr;
    @@ src/wasm-wasi-component/src/lib.rs: unsafe extern "C" fn start(
          task: *mut bindings::nxt_task_t,
          data: *mut bindings::nxt_process_data_t,
      ) -> bindings::nxt_int_t {
    -+    const EXIT_SUCCESS: c_int = 0;
    ++    let mut rc: bindings::nxt_int_t = 0;
     +
          handle_result(task, || {
              let config = GLOBAL_CONFIG.get().unwrap();
              let state = GlobalState::new(&config)
     @@ src/wasm-wasi-component/src/lib.rs: unsafe extern "C" fn start(
                  bail!("nxt_unit_init() failed");
              }

    +-        bindings::nxt_unit_run(unit_ctx);
    ++        rc = bindings::nxt_unit_run(unit_ctx);
              bindings::nxt_unit_done(unit_ctx);

              Ok(())
     -    })
     +    });
     +
    -+    exit(EXIT_SUCCESS);
    ++    exit(rc);
      }

      unsafe fn handle_result(
callahad commented 3 months ago

Okay, so we can early return an nxt_int_t, but it's otherwise fine to run to completion and exit?

Before this patch, start() would always return the result of handle_result which is either NXT_OK or NXT_ERROR.

After this patch, start() will exit with success if handle_result calls bail!(), otherwise, it will exit with the return value of bindings::nxt_unit_run. The return value of handle_result is ignored.

With your first version of this patch, start() would occasionally early return with NXT_ERROR, but would otherwise call exit.

What's the intended behavior?

(Edit: Updated to reflect the revision you just pushed)

ac000 commented 3 months ago

Okay, so we can early return an nxt_int_t, but it's otherwise fine to run to completion and exit?

Yes, once we return from nxt_unit_run() we (the application) are done, the process needs to exit, it's much simpler for that to be done by the application process itself.

Before this patch, start() would always return the result of handle_result which is either NXT_OK or NXT_ERROR .

Which is precisely where the issue lies, these processes were not terminating in some situations such as when being sent a QUIT message upon receiving an API restart command.

OTOH if you ^C'd unit then they would terminate properly due to exit(3) being called from nxt_runtime_exit() via a completely different code path.

After this patch, start() will always exit, silently ignoring the return value of handle_result.

Yes. we've been told to terminate, there's nothing expecting a return value at this point from the applications start function.

If all goes well, this function doesn't return...

With your original version of this patch, start() will occasionally early return with NXT_ERROR , but will otherwise call exit.

Maybe that's my lack of rust knowledge, but I don't think that's true. AFAIK the patch that used EXIT_SUCCESS had the same behaviour as the latest version of patch except we now use the return value of nxt_unit_run() as the exit(3) status code.

What's the intended behavior?

To exit(3) the application process when we're told.

ac000 commented 3 months ago

With your original version of this patch, start() will occasionally early return with NXT_ERROR , but will otherwise call exit.

Maybe that's my lack of rust knowledge, but I don't think that's true.

Oh, right, with the bail!() macros.

That hasn't changed, but if we get as far as calling nxt_unit_run() then we don't return from this function as as intended.

callahad commented 3 months ago

Bingo :) (I missed the bail!()'s on my first read, too)

ac000 commented 3 months ago

Fix a build issue on arm64. nxt_unit_run() returns an int not an nxt_int_t...

$ git range-diff 510925b7...c443aac4
1:  510925b7 ! 1:  c443aac4 Wasm-wc: Fix application restarts
    @@ src/wasm-wasi-component/src/lib.rs: unsafe extern "C" fn start(
          task: *mut bindings::nxt_task_t,
          data: *mut bindings::nxt_process_data_t,
      ) -> bindings::nxt_int_t {
    -+    let mut rc: bindings::nxt_int_t = 0;
    ++    let mut rc: i32 = 0;
     +
          handle_result(task, || {
              let config = GLOBAL_CONFIG.get().unwrap();
lcrilly commented 3 months ago

This looks different. Is it part of the patch or is it resulting from another commit?

Liam's MacBook:~/configs/unit/wasm/unit$ make
  VER    build/include/nxt_version.h (NXT_VERSION)
  VER    build/include/nxt_version.h (NXT_VERNUM)
  CC     build/src/nxt_lib.o
  CC     build/src/nxt_gmtime.o
…
ac000 commented 3 months ago

This looks different. Is it part of the patch or is it resulting from another commit?

   Liam's MacBook:~/configs/unit/wasm/unit$ make
     VER    build/include/nxt_version.h (NXT_VERSION)
     VER    build/include/nxt_version.h (NXT_VERNUM)
     CC     build/src/nxt_lib.o
     CC     build/src/nxt_gmtime.o
     …  

Yes, this is from other recent work.

You can view the verbose make output with

$ make V=1 ...

(or by passing --debug=print to make on newer versions)

callahad commented 3 months ago

Do we still want to call exit(0) when handle_result bails? Or do we want to exit(-1)?

callahad commented 3 months ago

...or even return -1?

ac000 commented 3 months ago

It should be returning -1 on error.

I don't know what bail!() does, but it certainly doesn't seem to be retuning -1 and is causing Unit to go nuts!

If I return -1 at the top of the start() function, things fail and clean up cleanly.

callahad commented 3 months ago

I can tell I'm really selling you on Rust here ;)

Looking just at the call to handle_result on line 107. That will evaluate to NXT_ERROR if we hit any of the bail!() paths. Otherwise, it will set rc to the result of nxt_unit_run and evaluate to NXT_OK.

But we're not capturing that value at all. Which means even if we hit a bail!, then the outer function body will continue on and hit exit(rc). And rc will still be 0 from its initialization.

I think we want would be closer to:

diff --git a/src/wasm-wasi-component/src/lib.rs b/src/wasm-wasi-component/src/lib.rs
index 3ee40c4f..1b5171a2 100644
--- a/src/wasm-wasi-component/src/lib.rs
+++ b/src/wasm-wasi-component/src/lib.rs
@@ -101,7 +101,9 @@ unsafe extern "C" fn start(
     task: *mut bindings::nxt_task_t,
     data: *mut bindings::nxt_process_data_t,
 ) -> bindings::nxt_int_t {
-    handle_result(task, || {
+    let mut rc: i32 = 0;
+
+    let result = handle_result(task, || {
         let config = GLOBAL_CONFIG.get().unwrap();
         let state = GlobalState::new(&config)
             .context("failed to create initial state")?;
@@ -123,11 +125,19 @@ unsafe extern "C" fn start(
             bail!("nxt_unit_init() failed");
         }

-        bindings::nxt_unit_run(unit_ctx);
+        rc = bindings::nxt_unit_run(unit_ctx);
         bindings::nxt_unit_done(unit_ctx);

         Ok(())
-    })
+    });
+
+    if (result != bindings::NXT_OK as bindings::nxt_int_t) {
+        return result;
+    } else if (rc != 0) {
+        return rc;
+    }
+
+    exit(0);
 }

 unsafe fn handle_result(

...but this could be totally wrong in several ways (haven't actually checked to see if it'd compile, and it's ugly enough that there must be a better way to do it. Lastly, it's not clear if we should return or exit with the value of nxt_unit_run when it fails.)

ac000 commented 3 months ago

clear if we should return or exit with the value of nxt_unit_run when it fails.)

AFAIK and looking at other modules once you've called nxt_unit_run() the only thing to do is exit(3) either with or without the return value of nxt_unit_run() as the status code.

Side note: You can call nxt_unit_run() from multiple threads, in which case in those threads you return NULL, but that's simply because the thread as created by pthread_create(3) has a function prototype of

void *(*start_routine)(void *);

and you do the exit(3) from the main thread.

After some testing...

If I simply return bindings::NXT_ERROR; after the closure (returning 0 gives the original broken behaviour). Then things do work although I get the following in the log after hitting /control/applications/echo-request/restart

2024/03/13 15:32:23 [alert] 140691#140691 recvmsg(8, 2) failed (9: Bad file descriptor)
2024/03/13 15:32:34 [alert] 140691#140691 close(15) failed (9: Bad file descriptor)
2024/03/13 15:32:34 [alert] 140690#140690 sendmsg(12, -1, -1, 1) failed (32: Broken pipe)
2024/03/13 15:32:34 [alert] 140691#140691 close(8) failed (9: Bad file descriptor)
2024/03/13 15:32:34 [alert] 140691#140691 close(12) failed (9: Bad file descriptor)
2024/03/13 15:32:34 [alert] 140691#140691 close(13) failed (9: Bad file descriptor)
2024/03/13 15:32:34 [notice] 140690#140690 app process 140691 exited with code 1
2024/03/13 15:32:34 [notice] 140686#140686 process 140690 exited with code 0
2024/03/13 15:32:34 [info] 140737#140737 "echo-request" prototype started
2024/03/13 15:32:34 [info] 140741#140741 "echo-request" application started

As opposed to if I call exit(3)

2024/03/13 15:35:41 [alert] 141040#141040 sendmsg(12, -1, -1, 1) failed (32: Broken pipe)
2024/03/13 15:35:41 [notice] 141040#141040 app process 141041 exited with code 0
2024/03/13 15:35:41 [notice] 141036#141036 process 141040 exited with code 0
2024/03/13 15:35:41 [info] 141064#141064 "echo-request" prototype started
2024/03/13 15:35:41 [info] 141069#141069 "echo-request" application started

Testing your patch by forcing a bail!() in the closure, we end up in an infinite loop of core dumps

2024/03/13 15:38:38 [info] 143454#143454 "echo-request" application started
2024/03/13 15:38:38 [alert] 143454#143454 nxt_unit_init() failed
2024/03/13 15:38:38 [alert] 141247#141247 app process 143454 exited on signal 11 (core dumped)
2024/03/13 15:38:38 [info] 143460#143460 "echo-request" application started
2024/03/13 15:38:38 [alert] 143460#143460 nxt_unit_init() failed
2024/03/13 15:38:38 [alert] 141247#141247 app process 143460 exited on signal 11 (core dumped)
2024/03/13 15:38:38 [info] 143466#143466 "echo-request" application started
2024/03/13 15:38:38 [alert] 143466#143466 nxt_unit_init() failed
2024/03/13 15:38:38 [alert] 141247#141247 app process 143466 exited on signal 11 (core dumped)

So it seems to me

1) bail!() is still problematic (what about assert!() ?) 2) You can return NXT_ERROR before calling nxt_unit_run() but after calling it, exit(3)ing is the right thing to do (as per our existing modules)

callahad commented 3 months ago

Okay, so it sounds like smeantically we want something more like

if (result != bindings::NXT_OK as bindings::nxt_int_t) {
    return result;
}

exit(rc)

But that our handling of bail!() has maybe always been problematic?

ac000 commented 3 months ago

Unfortunately that still results in coredump city...

Just for clarity this is what I tested

diff --git a/src/wasm-wasi-component/src/lib.rs b/src/wasm-wasi-component/src/lib.rs
index ed35024a..0df88ff5 100644
--- a/src/wasm-wasi-component/src/lib.rs
+++ b/src/wasm-wasi-component/src/lib.rs
@@ -104,7 +104,7 @@ unsafe extern "C" fn start(
 ) -> bindings::nxt_int_t {
     let mut rc: i32 = 0;

-    handle_result(task, || {
+    let result = handle_result(task, || {
         let config = GLOBAL_CONFIG.get().unwrap();
         let state = GlobalState::new(&config)
             .context("failed to create initial state")?;
@@ -122,6 +122,7 @@ unsafe extern "C" fn start(
         wasm_init.callbacks.request_handler = Some(request_handler);

         let unit_ctx = bindings::nxt_unit_init(&mut wasm_init);
+        bail!("nxt_unit_init() failed");
         if unit_ctx.is_null() {
             bail!("nxt_unit_init() failed");
         }
@@ -132,6 +133,12 @@ unsafe extern "C" fn start(
         Ok(())
     });

+    if result != bindings::NXT_OK as bindings::nxt_int_t {
+        println!("Bailed returning ({})", result);
+        return result;
+    }
+
+    println!("Calling exit()");
     exit(rc);
 }

Resulting in the following on the console

...
2024/03/13 21:46:19 [info] 174243#174243 unit 1.33.0 started
Bailed returning (-1)
Bailed returning (-1)
Bailed returning (-1)
Bailed returning (-1)
Bailed returning (-1)
Bailed returning (-1)
...

and a bunch of coredumps... and to clarify, it's the application process that's coredumping.

To double check I tried doing something similar in the original wasm module, i.e returning NXT_ERROR before calling nxt_unit_run()

diff --git a/src/wasm/nxt_wasm.c b/src/wasm/nxt_wasm.c
index 92ed57ab..6dce70ec 100644
--- a/src/wasm/nxt_wasm.c
+++ b/src/wasm/nxt_wasm.c
@@ -213,6 +213,9 @@ nxt_wasm_start(nxt_task_t *task, nxt_process_data_t *data)

     wasm_init.callbacks.request_handler = nxt_wasm_request_handler;

+    printf("%s: Bailing out...\n", __func__);
+    return NXT_ERROR;
+
     unit_ctx = nxt_unit_init(&wasm_init);
     if (nxt_slow_path(unit_ctx == NULL)) {
         return NXT_ERROR;
2024/03/13 21:57:40 [info] 174935#174935 unit 1.33.0 started
nxt_wasm_start: Bailing out...

Everything works as expected.

callahad commented 3 months ago

Do you get coredump city with current master, with no changes other than forcing the bail?

ac000 commented 3 months ago

Yes I do...

callahad commented 3 months ago

Okay, so that part's never worked. Merging this PR is strictly better than the status quo (fixes restarts), but failure cases are still broken.

Do you want to tackle those in this PR, or in a followup?

ac000 commented 3 months ago

I just realised I did the return in the wrong place in the original module, I should have put it after the call to nxt_unit_init(). If I do that then I get the coredumps...

So I think the bails are OK as long as you don't do it after a successful call to nxt_unit_init() and we just need to do that check you added for whether we return or exit(3).

I will cook this up as a separate patch with you as the author as this fixes a separate issue...

(The application process probably shouldn't be core dumping, but that's another issue entirely...)

This is the patch of yours I'm proposing...

diff --git a/src/wasm-wasi-component/src/lib.rs b/src/wasm-wasi-component/src/lib.rs
index ed35024a..b0552e81 100644
--- a/src/wasm-wasi-component/src/lib.rs
+++ b/src/wasm-wasi-component/src/lib.rs
@@ -104,7 +104,7 @@ unsafe extern "C" fn start(
 ) -> bindings::nxt_int_t {
     let mut rc: i32 = 0;

-    handle_result(task, || {
+    let result = handle_result(task, || {
         let config = GLOBAL_CONFIG.get().unwrap();
         let state = GlobalState::new(&config)
             .context("failed to create initial state")?;
@@ -132,6 +132,10 @@ unsafe extern "C" fn start(
         Ok(())
     });

+    if result != bindings::NXT_OK as bindings::nxt_int_t {
+        return result;
+    }
+
     exit(rc);
 }
callahad commented 3 months ago

Sounds like a plan

ac000 commented 3 months ago

@callahad

I decided to just fold that patch into the main one as it is really only needed due to this patch.

Permission to add your

Co-developed-by: Dan Callahan <d.callahan@f5.com>
Signed-off-by: Dan Callahan <d.callahan@f5.com>
ac000 commented 3 months ago

Probably helps if I actually push the changes!

$ git range-diff c443aac4...7ab4851b
1:  c443aac4 ! 1:  7ab4851b Wasm-wc: Fix application restarts
    @@ Commit message
         So that's what we do. We use the return value of nxt_unit_run() as the
         exit status.

    +    Due to exit(3)ing we also need to now explicitly handle the return on
    +    error case.
    +
         Reported-by: Liam Crilly <liam@nginx.com>
         Fixes: 20ada4b5c ("Wasm-wc: Core of initial Wasm component model language module support")
         Closes: https://github.com/nginx/unit/issues/1179
    @@ src/wasm-wasi-component/src/lib.rs: unsafe extern "C" fn start(
          task: *mut bindings::nxt_task_t,
          data: *mut bindings::nxt_process_data_t,
      ) -> bindings::nxt_int_t {
    +-    handle_result(task, || {
     +    let mut rc: i32 = 0;
     +
    -     handle_result(task, || {
    ++    let result = handle_result(task, || {
              let config = GLOBAL_CONFIG.get().unwrap();
              let state = GlobalState::new(&config)
    +             .context("failed to create initial state")?;
     @@ src/wasm-wasi-component/src/lib.rs: unsafe extern "C" fn start(
                  bail!("nxt_unit_init() failed");
              }
    @@ src/wasm-wasi-component/src/lib.rs: unsafe extern "C" fn start(
     -    })
     +    });
     +
    ++    if result != bindings::NXT_OK as bindings::nxt_int_t {
    ++        return result;
    ++    }
    ++
     +    exit(rc);
      }
callahad commented 3 months ago

@callahad

I decided to just fold that patch into the main one as it is really only needed due to this patch.

Permission to add your

Co-developed-by: Dan Callahan <d.callahan@f5.com>
Signed-off-by: Dan Callahan <d.callahan@f5.com>

Makes sense; go for it 👍

ac000 commented 3 months ago

Add Dan's tags

$ git range-diff 7ab4851b...b74b5bbf
1:  7ab4851b ! 1:  b74b5bbf Wasm-wc: Fix application restarts
    @@ Commit message
         Closes: https://github.com/nginx/unit/issues/1179
         Tested-by: Liam Crilly <liam@nginx.com>
         Tested-by: Danielle De Leo <d.deleo@f5.com>
    +    Co-developed-by: Dan Callahan <d.callahan@f5.com>
    +    Signed-off-by: Dan Callahan <d.callahan@f5.com>
         Signed-off-by: Andrew Clayton <a.clayton@nginx.com>

      ## src/wasm-wasi-component/src/lib.rs ##
ac000 commented 3 months ago

Rebase with master

$ git range-diff b74b5bbf...e75f8d5d
-:  -------- > 1:  9993814d NJS: loader should be registered using njs_vm_set_module_loader()
-:  -------- > 2:  0716b0c7 Tests: NJS cacheable variables with access log
-:  -------- > 3:  dc16a7bc Add a repostatus badge to the README
-:  -------- > 4:  7472a2ca Add a GitHub workflow status badge for our CI to the README
-:  -------- > 5:  b65e49c5 Build with -std=gnu11 (C11 with GNU extensions)
1:  4c033098 = 6:  a8cfea8b Rebuild wasm-wasi-component when any of its dependencies change
2:  b74b5bbf = 7:  e75f8d5d Wasm-wc: Fix application restarts