Preloading modules for performant and isolated evaluation

jpdenford commented 2 months ago

Hey, firstly, thank you for the amazing project. I'm interested in learning more about how to improve performance when loading external modules. I'm very new to wasm in general, so please bear with me.

The context is that I want to preload JS modules, then execute lots of short-lived scripts (which themselves don't need to import anything). Each script should run in isolation.

I'm loading modules (a minified/bundled version of Intl) into the (JS) global context. This works using setModuleLoader and allows doing things like Number(1).toLocaleString('en').

However, it takes a decent amount of time to parse and load the module (which is a couple of MB) for each script. This approach feels really heavyweight.

Is there a way to preload js modules such that each new context has them available but is still secured?

Some ideas which seem related but I don't yet have the knowledge to pursue are:

- The approach in https://github.com/justjake/quickjs-emscripten/issues/152: preload the module, copy the memory, then use a new variant and context for each script evaluation. I'm not confident that this approach is secure given the shared pointers, or that there isn't an easier way.
- Preloading the module(s) into the wasm binary itself (as though QuickJS shipped them natively). I assume, however, that this would still take a fair amount of time whenever a runtime is created (as it would need to parse and execute the module code, e.g. Intl).
- Same as the above, but with some kind of memory preloading using https://github.com/bytecodealliance/wizer
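
The first and third ideas both come down to capturing linear memory once, after the expensive load, and starting each evaluation from a copy of it. A toy sketch of that idea, under loud assumptions: bare WebAssembly.Memory objects stand in for an instance's memory, the four bytes stand in for preloaded heap state, and restoreIntoFreshMemory is a hypothetical helper that is not part of quickjs-emscripten. A real instance also keeps state outside linear memory (wasm globals and tables), which this ignores.

```typescript
const PAGE_SIZE = 65536; // a WebAssembly page is 64 KiB

// Stand-in for the memory of an instance that has already loaded the module.
const warm = new WebAssembly.Memory({ initial: 1 });
new Uint8Array(warm.buffer).set([1, 2, 3, 4], 0); // pretend preloaded state

// Take the snapshot once; slice() copies the bytes out of the live memory.
const snapshot = new Uint8Array(warm.buffer).slice();

// Hypothetical helper: allocate a fresh memory and copy the snapshot into it.
function restoreIntoFreshMemory(snap: Uint8Array): WebAssembly.Memory {
  const pages = Math.ceil(snap.byteLength / PAGE_SIZE);
  const fresh = new WebAssembly.Memory({ initial: pages });
  new Uint8Array(fresh.buffer).set(snap);
  return fresh;
}

// Each "script" gets its own copy, so writes in one are invisible to the other.
const first = restoreIntoFreshMemory(snapshot);
const second = restoreIntoFreshMemory(snapshot);
new Uint8Array(first.buffer)[0] = 99;

const firstBytes = Array.from(new Uint8Array(first.buffer, 0, 4));
const secondBytes = Array.from(new Uint8Array(second.buffer, 0, 4));
console.log(firstBytes, secondBytes);
```

Because every restore is a full copy, no pointers are shared between evaluations; the cost is one memory-sized copy per script instead of one parse per script.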

I'm finding this difficult to reason about given my lack of understanding of how memory is laid out in wasm and how that relates to this project's concepts of 'context' or 'runtime' isolation.

Would love to hear any suggestions or ideas that you have for this project. Thanks!

justjake commented 2 months ago

QuickJS itself provides functions to read and write bytecode, including function bytecode. This is used by the qjsc program to compile JS sources into bytecode that can be included in a C file. See the docs here: https://bellard.org/quickjs/quickjs.html#Executable-generation

We already have a few routines that can load and store "bjson" values. They call down to the JS_ReadObject and JS_WriteObject C functions, which implement bytecode I/O, but they don't set the flag that allows storing executable code.

I'm not sure to what extent your time is spent on parsing code. If the majority of the time is parsing, then pre-parsing could result in a big speedup.
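
To make the "compile once, reuse the compiled form" idea concrete, here is an analogy only. It uses Node's built-in vm module and V8's code cache, not QuickJS bytecode or any quickjs-emscripten API, and the one-line source is a placeholder for a large preloaded bundle.

```typescript
import * as vm from "node:vm";

// Placeholder for a large bundle that is expensive to parse.
const source = "6 * 7";

// First compile: parse the source and ask V8 for a serialized code cache.
const compiledOnce = new vm.Script(source, { filename: "preload.js" });
const cache = compiledOnce.createCachedData();

// Later compiles hand the cache back so the engine can skip parsing work.
const reused = new vm.Script(source, {
  filename: "preload.js",
  cachedData: cache,
});

// cachedDataRejected reports whether the engine refused the supplied cache.
console.log("cache rejected:", reused.cachedDataRejected);

// Each run happens in a brand-new context, so scripts stay isolated.
const result = reused.runInNewContext();
console.log(result);
```

Whether the equivalent step pays off in QuickJS depends on how much of the load time is parsing rather than executing the module body, so it is worth measuring both phases first.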

When it comes to module/runtime/context:

- A QuickJSEmscripten "module" wraps a WebAssembly.Instance. It is an isolation boundary provided by the host. Unless you explicitly share memory between WebAssembly.Modules, they are isolated from each other.
- QuickJSRuntime and QuickJSContext wrap the *JSRuntime and *JSContext C types. These are C structs allocated inside a WebAssembly instance's Memory. A Module can contain multiple Runtimes. A Runtime can contain multiple Contexts. You can share objects between Contexts that have the same Runtime (and thus exist in the same Module memory).
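
The host-level isolation described above can be seen with plain WebAssembly APIs. This is a minimal sketch that does not use QuickJS: the hand-written bytes are assumed to encode the module (module (memory (export "mem") 1)), which is a stand-in for the real QuickJS wasm binary.

```typescript
// Hand-assembled binary for: (module (memory (export "mem") 1))
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x05, 0x03, 0x01, 0x00, 0x01, // memory section: one memory, min 1 page
  0x07, 0x07, 0x01, 0x03, 0x6d, 0x65, 0x6d, 0x02, 0x00, // export memory 0 as "mem"
]);

// Compile once, then instantiate twice from the same compiled module.
const compiled = new WebAssembly.Module(wasmBytes);
const instanceA = new WebAssembly.Instance(compiled);
const instanceB = new WebAssembly.Instance(compiled);

const memoryA = instanceA.exports.mem as WebAssembly.Memory;
const memoryB = instanceB.exports.mem as WebAssembly.Memory;

// Write into instance A's memory only.
new Uint8Array(memoryA.buffer)[0] = 42;

const seenByA = new Uint8Array(memoryA.buffer)[0];
const seenByB = new Uint8Array(memoryB.buffer)[0];
const sameBuffer = memoryA.buffer === memoryB.buffer;
console.log({ seenByA, seenByB, sameBuffer });
```

Each instance defines its own memory, so the write through instance A is not visible through instance B. Runtimes and contexts, by contrast, live inside a single instance's memory, which is why objects can be shared between contexts of the same runtime.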

As for WASM memory layout, I'm not sure what to say. I don't understand how Emscripten's malloc and free work or how they interact with the browser WASM runtime.

This is specifically how JSRuntime and JSContext are related:

https://github.com/justjake/quickjs-emscripten/blob/af04b41ece3efd5929f7676a81903b31695a766e/vendor/quickjs/quickjs.c#L2128-L2161

justjake commented 2 months ago

[Diagram image not preserved]

jpdenford commented 2 months ago

Thank you for the wealth of info @justjake. I'll take a bit of time to absorb and do some more reading.

From my first read, it sounds like one approach might be to include my module (Intl) in the vendor/quickjs/Makefile (similar to bjson or bignum) and build the wasm module from source, so that it's already included when QuickJS loads.

At that point I can retest by passing that new wasm module as a quickjs-emscripten 'variant' and monitor the initialisation time again to see how it's performing.

Out of interest, did you create the diagram yourself, or is it from somewhere you could link to? It seems like that could be a good place for me to learn further.

justjake commented 2 months ago

I drew the diagram in tldraw.

Preloading modules for performant and isolated evaluation #193