justjake / quickjs-emscripten

Safely execute untrusted Javascript in your Javascript, and execute synchronous code that uses async functions
https://www.npmjs.com/package/quickjs-emscripten

Preloading modules for performant and isolated evaluation #193

Open jpdenford opened 2 months ago

jpdenford commented 2 months ago

Hey, firstly, thank you for the amazing project. I'm interested in learning more about how to improve performance when loading external modules. I'm very new to wasm in general, so please bear with me.

The context is that I want to preload JS modules, then execute lots of short-lived scripts (which themselves don't need to import anything). Each script should run in isolation.

I'm loading modules (a minified/bundled version of Intl) into the (JS) global context. This works using setModuleLoader and allows doing things like Number(1).toLocaleString('en').

However, it takes a decent amount of time to parse and load the module (which is a couple of MB) for each script. This approach feels really heavyweight.

Is there a way to preload js modules such that each new context has them available but is still secured?

Some ideas which seem related but I don't yet have the knowledge to pursue are:

I'm finding this difficult to reason about given my lack of understanding of how memory is laid out in wasm and how that relates to this project's concepts of 'context' or 'runtime' isolation.

I would love to hear any suggestions or ideas that you have for this. Thanks!

justjake commented 2 months ago

quickjs itself provides functions to read and write bytecode, including function bytecode. This is used by the qjsc program to compile JS sources into bytecode that can be included in a C file. See the docs here: https://bellard.org/quickjs/quickjs.html#Executable-generation

We already have a few routines that can load and store "bjson" values. They call down to the JS_ReadObject and JS_WriteObject C functions, which implement bytecode I/O, but they don't set the flag that allows storing executable code.

I'm not sure to what extent your time is spent on parsing code. If the majority of the time is parsing, then pre-parsing could result in a big speedup.
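One way to check where the time goes is to time each phase separately before investing in bytecode support. The following is only a minimal sketch: `timeSteps` and `TimedStep` are hypothetical names, not part of quickjs-emscripten, and the workloads are stand-ins so the snippet runs on its own.

```typescript
// Hypothetical helper (not a quickjs-emscripten API): runs each named
// step once, in order, and records how long it took in milliseconds.
type TimedStep = { name: string; ms: number };

function timeSteps(steps: Array<[string, () => void]>): TimedStep[] {
  const results: TimedStep[] = [];
  for (const [name, run] of steps) {
    const start = Date.now(); // millisecond resolution; coarse but dependency-free
    run();
    results.push({ name, ms: Date.now() - start });
  }
  return results;
}

// Stand-in workloads. In a real measurement each callback would wrap one
// phase: creating a context, loading the Intl bundle, running one script.
const report = timeSteps([
  ["create context (stand-in)", () => {}],
  ["load module (stand-in)", () => {
    let text = "";
    for (let i = 0; i < 10000; i++) text += "x";
    JSON.parse(JSON.stringify(text));
  }],
  ["run script (stand-in)", () => {
    Number(1).toLocaleString("en");
  }],
]);

for (const { name, ms } of report) {
  console.log(`${name}: ${ms} ms`);
}
```

If the "load module" phase dominates when the callbacks wrap the real calls, pre-parsing is likely worth pursuing. If context creation or script execution dominates instead, bytecode would probably help less.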


When it comes to module/runtime/context:

As for WASM memory layout, I'm not sure what to say. I don't understand how Emscripten's malloc and free work or how they interact with the browser WASM runtime.

This is specifically how JSRuntime and JSContext are related:

https://github.com/justjake/quickjs-emscripten/blob/af04b41ece3efd5929f7676a81903b31695a766e/vendor/quickjs/quickjs.c#L2128-L2161

justjake commented 2 months ago

[Image: Untitled]

jpdenford commented 2 months ago

Thank you for the wealth of info @justjake. I'll take a bit of time to absorb and do some more reading.

From my first read, it sounds like an approach might be to include my module (Intl) in the vendor/quickjs/Makefile (similar to the bjson or bignum) and build the wasm module from sources so that it's already included when quickjs loads.

At that point I can retest by passing that new wasm module as a quickjs-emscripten 'variant' and monitor the initialisation time again to see how it's performing.

Out of interest, did you create the diagram, or is it from somewhere you could link to? It seems like that could be a good place for me to learn further.

justjake commented 2 months ago

I drew the diagram in tldraw