justjake / quickjs-emscripten

Safely execute untrusted Javascript in your Javascript, and execute synchronous code that uses async functions
https://www.npmjs.com/package/quickjs-emscripten
Other
1.29k stars 97 forks source link

Suggestion/Feature: upstream web compatible version? #33

Closed twop closed 10 months ago

twop commented 3 years ago

Thanks so much for working on the project!

I want to build a small js playground on the web and I couldn't make quickjs-emscripten work :(

I forked the project and republished as esm-quickjs-emscripten (repo: https://github.com/twop/quickjs-emscripten)

My goals:

I'm not comfortable modifying make files and I'm more familiar with rust/wasm toolchain, thus the quality of my changes might be not superb ^_^

If you think that these changes are valuable I would be happy to collaborate to upstream.

Side note: I published the wasm file to npm, but the required step to make it available by ./quickjs-emscripten-module.wasm is manual and tool dependent (I just copied the wasm file to my dist folder). To my knoweledge emscripten doesn't allow to specify wasm file location (wasm-pack does).

Side note 2: I'm thinking that expanding the api surface area for wasm file itself might get tricky overtime. I usually use a different approach with rust/wasm.

// this is binary representation of a command + required data needed to be executed by wasm
const serializedCmd: Uint8Array = {...} 

// tell wasm that we need that much space to store it
// return result is a VIEW of the wasm memory
const wasmMemView: Uint8Array = wasm.allocate_space(serializedCmd.lenght);

// copy cmd + data to wasm memory
wasmMemView.set(serializedCmd);

// tell wasm that the memory is filled with payload + handle it
// returns VIEW of the wasm memory that holds the result
const responseView: Uint8Array = wasmpApi.handle_message();

// decode the result into something meaningful, like a normal js object
const result = decode(responseView);

One really cool thing about this approach is that js -> wasm boundary is explicit and it can be expensive because of all the bookkeeping that needs to be done with GC and memory management, and this approach helps with that by not using any manual resource management.

Also, it seems that this approach produces the smallest js and wasm.

Happy to chat more if you are curious about it

justjake commented 3 years ago

Thank you for the issue.

It’s helpful to know that the FS issue messes with people. For what it’s worth, I solve this in webpack following this stack overflow question: https://stackoverflow.com/questions/59487224/webpack-throws-error-with-emscripten-cant-resolve-fs

You are encouraging ESM output in the Node module. I would like to do so, but have little experience configuring ESM tools. It would help to set up an ESM example project in ./examples to demonstrate a typical ESM web app solution.

I have some other questions as well:

I published the wasm file to npm, but the required step to make it available by ./quickjs-emscripten-module.wasm is manual and tool dependent (I just copied the wasm file to my dist folder). To my knoweledge emscripten doesn't allow to specify wasm file location (wasm-pack does).

Do you mean you copied the .wasm file into quickjs-emacripten’s dist folder as part of the build process of the npm package, or that the consumer who installs the npm package has to copy the .wasm file as part of their build process? I don’t want consumers to need to copy files or change build tool config to use this package. Is there a work around there?

I'm thinking that expanding the api surface area for wasm file itself might get tricky overtime. I usually use a different approach with rust/wasm.

Can you explain more what advantages you see of building a command struct, vs the function calls that the project currently uses? What is the specific advantage does this post with regard to resource use? Would you expect the library consumer to change significantly? When you say bookkeeping is expensive, are you referring to the Lifetime class?

justjake commented 3 years ago

Goals: • esmodule output ready to be consumed on the web

I understand why this is wanted

• shipped as the latest ecmascript target.

I understand why this is wanted a bit less. I assume, cruft-free and your build tools can handle it

• ship actual .wasm file

This one I understand the least. I would think that avoiding an extra .wasm file actually makes the users life easier. Can you explain this?

twop commented 3 years ago

Let me try to comment on your latest questions

Goals:
• esmodule output ready to be consumed on the web

I understand why this is wanted

It is possible to set up export map and configure the output to be consumed as an es module and cjs. https://docs.skypack.dev/package-authors/package-checks#export-map Has some good info on this.

• shipped as the latest ecmascript target.

I understand why this is wanted a bit less. I assume, cruft-free and your build tools can handle it

According to some studies es5 js target is one of the major perf issues on the web (https://www.youtube.com/watch?v=lFd0tfYWGJ8). Shipping a library as es5 you are making a target choice for the developers, because they can't transpile it back es5 -> es2018 (yet). Also, WASM is a relatively new thing, browsers that support WASM at least support es6 (things like classes)

• ship actual .wasm file

This one I understand the least. I would think that avoiding an extra .wasm file actually makes the users life easier. Can you explain this?

There are several benefits to this:

The downside of having a separate WASM file is that you somehow need to integrate that into your tooling. I don't know what is the right approach there (I don't have wasm libraries published) but wasm-pack has several modes of generating glue code but all of them are leveraging a separate .wasm file: https://rustwasm.github.io/wasm-pack/book/tutorials/npm-browser-packages/building-your-project.html

twop commented 3 years ago

I'm thinking that expanding the api surface area for wasm file itself might get tricky overtime. I usually use a different approach with rust/wasm.

Can you explain more what advantages you see of building a command struct, vs the function calls that the project currently uses? What is the specific advantage does this post with regard to resource use? Would you expect the library consumer to change significantly? When you say bookkeeping is expensive, are you referring to the Lifetime class?

I was mostly referring to the conversion between js and wasm values. This article goes in depth about this issue: https://hacks.mozilla.org/2018/10/calls-between-javascript-and-webassembly-are-finally-fast-%f0%9f%8e%89/

I think the main advantage that I see is that in the model of commands you don't need to store a "handle", at least in some cases.

// before
const world = vm.createString('world')
vm.setProp(vm.global, 'NAME', world)
world.dispose()

//after
vm.setProp(vm.global, 'NAME', vm.newString('world'))

// vm.newString('world') just produces a desc of a string that will be GCollected as usual

If the value world is consumed immediately (which is often the case) then there is no need to manage object lifecycles between two runtimes. The only case (in my opinion at least) when you do need to keep the handle alive when it is important to pass the same reference to different places (like an object or function references). Like so

// pseudocode
const objHandle = vm.newObj({...})
const a =  vm.newFunction('a', () => objHandle));
const b =  vm.newFunction('b', () => objHandle)); // <-- the same reference

// inside vm
if (a() === b()) { ...}

In my experience working with WASM it never came up. Thus, I think transferring object by values is simpler and less error prone.

This is essentially the same difference between Deno and Node. Deno uses a single binary method doStuff(arrayBuffer) to talk between rust and V8. At least that is my understanding based on one of the talks they gave.

Hope that helps.

71 commented 3 years ago

Are there any updates on this? I'd like to use a .wasm/latest-ecmascript version of the library too, and @twop's is a few commits behind this repo (which gets updated regularly, which is great, but which also means that @twop's fork will usually lag behind).

justjake commented 3 years ago

@71 I'd welcome any contributions on these topics. Here's the situation:

.wasm file

I discussed this with @carlosdp a few days ago, he may be working on it already.

I think the model we should follow is how sql.js does this, by providing a locateFile option when initializing the library. See an example here: https://github.com/sql-js/sql.js/#usage which uses the underlying Emscripten feature described here: https://emscripten.org/docs/api_reference/module.html#Module.locateFile

Ecmascript Modules syntax

I like the idea of producing a ESM build as a secondary output included in the package. I think it's possible to provide this using a field in package.json, or even just a separate entrypoint file eg import { getQuickJS } from 'quickjs-emscripten/esm'. I don't know much about this subject; I haven't used ESM yet.

Compatibility

carlosdp commented 3 years ago

I haven't started on this yet, been prioritizing other things, but want to get to it soon. That said, I think maintaining backward-compatibility with current functionality should be fine, at least for my use-case. It won't help with bundle size, sure, but the issue I'm having is with loaded memory, so that would be solved even if the non-locateFile option with the binary embedded still exists.

WebReflection commented 1 year ago

Same error about fs here ... emscripten ships its own FS so it'd be great to avoid needing node:fs entirely and use what other WASM based interpreters use. If you'd like to grab that at runtime here there's just that: https://github.com/WebReflection/fs#readme

justjake commented 1 year ago

I’m happy to accept a PR addressing this issue, but don’t have time to work on it myself.

WebReflection commented 1 year ago

@justjake I might just use this module and re-package it in a way that fs is never used or it's just ignored but beside that I wonder if there's any live demo in-browser that actually works and doesn't have issues with the FS import ... any pointer would do, thank you!

justjake commented 1 year ago

I use nextjs, you can use these webpack settings to fix the issue; most bundlers should have a similar setting: https://stackoverflow.com/questions/59487224/webpack-throws-error-with-emscripten-cant-resolve-fs

There should be a webpack demo in this repo, is that broken?

justjake commented 1 year ago

My website has an online demo, but isn’t open source https://jake.tl/projects/quickjs-emscripten

WebReflection commented 1 year ago

I'd like to have emscripten FS instead, so that files can be pre-fetched, as example, and the code can use import stuff from './path.js' within the evaluated code. If passing those flags to EMSCRIPTEN solves, what does it take to have a build in here that produces also Web ports of QuickJS? A fork could be an answer but it would lack inevitably behind for a reason or another ... would that PR be considered?

justjake commented 1 year ago

If you’re brave enough to get into the Makefile and add it as another build variant I’d be happy to accept. Note that you can already (without Emscripten filesystem) shim import and implement whatever module loading system you need my examples use node FS but you can use a Map etc to implement the module loading code.

WebReflection commented 1 year ago

to provide little extra context, the polyscript (fairly new) project works lovely with Pyodide, MicroPython, but also Wasmoon (Lua) and ruby-wasm-wasi ... and I am pretty sure with a type="quickjs-emscripten" things will be even more interesting for its users, as that unlocks tons of "sandboxed" use cases with extreme ease out of the box. Pyodide and microPython both ship with Emscripten FS and we can pre-fetch files and let them use from module import ... and it works great so that I'd love to see that happening with QuickJS too. End of the fairy-tale :-)

WebReflection commented 1 year ago

@justjake the module is up and running https://www.npmjs.com/package/@webreflection/quickjs-emscripten

something I've observed and the reason I won't file a PR:

Similar to you, I don't have time to provide such PR so I'll just try to keep my fork updated and, on that regard, it looks like your version of uickjs is also outdated so I suggest you run update-quickjs in the future or I'd be merging older version and need to do that myself after rebase.

edit P.S. my fork and branch that builds the module is here: https://github.com/WebReflection/quickjs-emscripten/tree/wasm-web and the module is published from here https://github.com/WebReflection/quickjs-emscripten/tree/wasm-web/esm

justjake commented 1 year ago

Yeah, I tried web-ifying about a year ago but got discouraged wading through all the issues you cited. My takeaway is that if I want the library to “just work” everywhere I need to start bundling out separate ESM and CJS bundles with Rollup or something. At the time the Emscripten module build mode didn’t work in Node either. Maybe these days it’d be possible to keep platform set to everything but turn on module mode and it would be happier, I think Emscripten made some progress on that side. Still it’s a shlep.

Anyways glad you got a fork working without too much trouble. I’ll link to it from my README.

justjake commented 10 months ago

@WebReflection @twop closing - v0.25.0+ now ship ESM builds and work fine via esm.sh CDN. All the included build variants use a separate WASM file, although single file variants are available from NPM separately. See the changelog entry here: https://github.com/justjake/quickjs-emscripten/blob/main/CHANGELOG.md#esmodules--browser-support