bytecodealliance / javy

JS to WebAssembly toolchain
Apache License 2.0
2.06k stars 100 forks source link

Example running using Node.js as a host #649

Closed asprouse closed 1 month ago

asprouse commented 1 month ago

Background

I am looking for a way to run user generated javascript securely in a Node.js host (not Node.js code in QuickJS). We have considered node:vm based solutions like https://github.com/laverdet/isolated-vm but most of these libraries are not maintained and warn against running untrusted user code. Ideally I want to generate a WASM module with multiple exports, one for each user defined function.

Question

Is there a basic example of how to run a Javy generated module in a Node.js host? Is this possible with node's built-in WASI support?

jeffcharles commented 1 month ago

Is there a basic example of how to run a Javy generated module in a Node.js host?

There isn't one presently. I'm certainly open to a contribution adding one.

Is this possible with node's built-in WASI support?

I haven't tested it but it should be. At the moment, Javy modules don't have any non-WASI function imports.

asprouse commented 1 month ago

There isn't one presently. I'm certainly open to a contribution adding one.

I am happy to contribute once I have something working.

asprouse commented 1 month ago

@jeffcharles I created an example https://github.com/asprouse/javy-nodejs-example

It is based on the sample code from this repo's readme

I am having an issue with reading the input from stdin. I get the following:

% npm start    

> javy-nodejs-example@0.0.1 start
> node --no-warnings=ExperimentalWarning host.mjs

Error RuntimeError: unreachable
    at wasm://wasm/003c2686:wasm-function[109]:0x61516
    at wasm://wasm/003c2686:wasm-function[169]:0x739d0
    at wasm://wasm/003c2686:wasm-function[1050]:0xbf6de
    at WASI.start (node:wasi:136:7)
    at run (file:///Users/als/code/javy-nodejs-example/host.mjs:51:10)
    at async file:///Users/als/code/javy-nodejs-example/host.mjs:69:18

I think the issue is with readInput. I am able to comment readInput out and run writeOutput({n: 101}) and my program will run correctly. Any idea how I can debug this?

jeffcharles commented 1 month ago

Is there any output printed to the stderr or stdout files when you see the unreachable Wasm trap? If there is, that could indicate what the error in JavaScript was.

We usually use Wasmtime and it definitely outputs error messages that we emit from inside the Wasm. You should be able to just run something like echo '{"n": 100}' | wasmtime embedded.wasm. If that runs successfully, then that hints that the problem is likely to be with host.mjs. If that doesn't run successfully, then that indicates the problem is likely with embedded.js.

Another approach if there is not output in the stderr or stdout files and you don't want to use Wasmtime, is you could try inserting calls to console.error and verifying you see output in the stderr file. If the first thing in embedded.js is a call to console.error and no output is emitted to the stderr file, then that would indicate a problem with how the host has set up forwarding to stderr. If that does work, then you could use a series of calls to console.error with different outputs to identify where your program terminates. Unfortunately, given this is an interpreter running inside a Wasm VM, there isn't a good step debugging option at the present time.

asprouse commented 1 month ago

Is there any output printed to the stderr or stdout files when you see the unreachable Wasm trap? If there is, that could indicate what the error in JavaScript was.

We usually use Wasmtime and it definitely outputs error messages that we emit from inside the Wasm. You should be able to just run something like echo '{"n": 100}' | wasmtime embedded.wasm. If that runs successfully, then that hints that the problem is likely to be with host.mjs. If that doesn't run successfully, then that indicates the problem is likely with embedded.js.

Another approach if there is not output in the stderr or stdout files and you don't want to use Wasmtime, is you could try inserting calls to console.error and verifying you see output in the stderr file. If the first thing in embedded.js is a call to console.error and no output is emitted to the stderr file, then that would indicate a problem with how the host has set up forwarding to stderr. If that does work, then you could use a series of calls to console.error with different outputs to identify where your program terminates. Unfortunately, given this is an interpreter running inside a Wasm VM, there isn't a good step debugging option at the present time.

Thanks for the advice! I updated the code in host.mjs to throw when there is a runtime error and there is output in stderr. This helped me find the issue. It was how I was opening the stdin file in host.mjs. I have pushed the fix and improvements to https://github.com/asprouse/javy-nodejs-example

The only thing left is to get the dynamic linking working. Once that is in place what is the best way to contribute? A PR with an update to the readme with a link to my repo or we could create a examples/nodejs-host directory in this repo?

jeffcharles commented 1 month ago

I'm happy you got your code working 😄!

Once that is in place what is the best way to contribute?

I was thinking just including a snippet of what you have in host.mjs in a markdown doc in the docs folder should be sufficient. Then we could link to that example in this section of the README.md.

asprouse commented 1 month ago

Ok sounds good I will prepare a PR.

My next issue is with using the dynamic linking feature (-d flag). I was hoping to get this into the documented example as it makes Javy usable in production. I made a branch of my repo where I added my attempt to get it working.

Now I get the error:

TypeError: "instance.exports.memory" property must be a WebAssembly.Memory object
    at setupInstance (node:wasi:43:19)
    at WASI.start (node:wasi:128:5)
    at run (file:///Users/als/code/javy-nodejs-example/host.mjs:63:10)
    at async file:///Users/als/code/javy-nodejs-example/host.mjs:94:18 {
  code: 'ERR_INVALID_ARG_TYPE'
}

This error is thrown when running wasi.start(). When adding a console.log(instance.exports) I get the following:

[Object: null prototype] { _start: [Function: 2] }

When I run the same code but without using -d I get:

[Object: null prototype] {
  memory: Memory [WebAssembly.Memory] {},
  _start: [Function: 1050],
  __main_void: [Function: 1214]
}

Am I missing something to get this working? Thanks again for your continued support.

jeffcharles commented 1 month ago

Could you try not providing the WASI imports when instantiating the wasm value and just provide them for the provider module?

Something like

    const provider = await WebAssembly.compile(
      await readFile(new URL("./provider.wasm", import.meta.url)),
    );

    const wasiImports = wasi.getImportObject();
    const providerInstance = await WebAssembly.instantiate(provider, wasiImports);

    const instance = await WebAssembly.instantiate(wasm, {
      javy_quickjs_provider_v1: providerInstance.exports,
    });

The dynamic module should be sending its bytecode to the provider instance which is what imports the WASI functions.

Not 100% sure that will work but worth a try.

asprouse commented 1 month ago

Could you try not providing the WASI imports when instantiating the wasm value and just provide them for the provider module?

Something like

    const provider = await WebAssembly.compile(
      await readFile(new URL("./provider.wasm", import.meta.url)),
    );

    const wasiImports = wasi.getImportObject();
    const providerInstance = await WebAssembly.instantiate(provider, wasiImports);

    const instance = await WebAssembly.instantiate(wasm, {
      javy_quickjs_provider_v1: providerInstance.exports,
    });

The dynamic module should be sending its bytecode to the provider instance which is what imports the WASI functions.

Not 100% sure that will work but worth a try.

I was able to get this to work by overriding the exports object using Object.defineProperty:

    const providerInstance = await WebAssembly.instantiate(
      providerModule,
      wasi.getImportObject(),
    );

    const instance = await WebAssembly.instantiate(embeddedModule, {
      javy_quickjs_provider_v1: providerInstance.exports,
    });
    const {_start} = instance.exports;

    // Hack to add the memory export from the provider module
    Object.defineProperty(instance, "exports", {
      get() {
        return {
          _start,
          memory: providerInstance.exports.memory,
        };
      },
    });

This feels hacky, is it a limitation in node's api that you cannot specify the memory to use in a more standard way? Or is there a way that Javy's compiler could (should?) re-export memory. I am pretty new to this WASM/WASI stuff I'd be curious what the most "correct" way to do this would be.

jeffcharles commented 1 month ago

This feels hacky, is it a limitation in node's api that you cannot specify the memory to use in a more standard way? Or is there a way that Javy's compiler could (should?) re-export memory. I am pretty new to this WASM/WASI stuff I'd be curious what the most "correct" way to do this would be.

I haven't had a chance to experiment with Node's WASI APIs so I don't have an answer off-hand. Hypothetically we could have the dynamically linked module re-export the memory from the provider. This file uses Walrus to write the module.

It looks like it's the use of WASI.start(instance) that's triggering the error because instance is the embedded module and not the provider and instance isn't a WASI module. I'll have to think about this more.

saulecabrera commented 1 month ago

An alternative, and potentially the right thing to do in this case, is to use the initialize WASI function, given that

Here's a diff of the changes:

diff --git a/host.mjs b/host.mjs
index 4004737..ca64be6 100644
--- a/host.mjs
+++ b/host.mjs
@@ -61,16 +61,22 @@ async function run(wasmFilePath, input) {
     const {_start} = instance.exports;

     // Hack to add the memory export from the provider module
-    Object.defineProperty(instance, "exports", {
-      get() {
-        return {
-          _start,
-          memory: providerInstance.exports.memory,
-        };
-      },
-    });
+    // Object.defineProperty(instance, "exports", {
+    //   get() {
+    //     return {
+    //       _start,
+    //       memory: providerInstance.exports.memory,
+    //     };
+    //   },
+    // });

-    wasi.start(instance);
+    wasi.initialize(providerInstance);
+    _start();
+
+    // Don't invoke through Wasi given that the dynamically linked module
+    // doesn't make use of any WASI APIs.
+    // wasi.start(instance);

     const [out, err] = await Promise.all([
       readOutput(stdoutFilePath),
asprouse commented 1 month ago

An alternative, and potentially the right thing to do in this case, is to use the initialize WASI function, given that

Ok that works and is much cleaner. I have made the change here: https://github.com/asprouse/javy-nodejs-example/blob/1b740577e8c03eacea6cc1bcd66ce2c63b5bdf76/host.mjs#L62-L64

@saulecabrera @jeffcharles What do you think of the current state of the example? Especially feedback on host.mjs before I make the PR to this repo.

Thanks again for all your help 😄 !

saulecabrera commented 1 month ago

In general it looks good to me. I'd suggest updating the example to use instantiateStreaming if possible, as it's generally more performant than compiling and then instantiating. https://developer.mozilla.org/en-US/docs/WebAssembly/JavaScript_interface/instantiateStreaming_static

Feel free to open a PR and we'd be happy to review in more detail!

saulecabrera commented 1 month ago

I'll go ahead and close this one as https://github.com/bytecodealliance/javy/pull/661 was merged; feel free to reopen if there's anything missing.