bytecodealliance / javy

JS to WebAssembly toolchain
Apache License 2.0
2.1k stars 101 forks source link

fix: JS values being leaked #617

Closed TheEdward162 closed 3 months ago

TheEdward162 commented 4 months ago

Description of the change

This (draft) PR is something of a question whether the javy project would be interested in upstreaming these changes. These changes fix leaking QuickJS objects, which weren't being dropped at all. The cost here is performance (as visible in javy-cli fuel tests).

The biggest API change is JSValueRef which is no longer Copy and not longer really a ref, more like an Rc. The name wasn't changed but maybe that would be approriate as well. Internally it was important to get right the semantics of when QuickJS expects ownership to be moved and when borrowed. Mostly it was browsing QuickJS source and trial and error, but should now be captured correctly.

The changes also involve reworking how callbacks are boxed. Mostly inspired by rquickjs and how they do function boxing. These are necessary to run correct drop functions for each callback, otherwise they were getting leaked as well.

Some changes were also done in quickjs-wasm-sys. Previously there was extensions/value.c file and its counterpart src/extensions/value.rs. The Rust file can be trivially generated into the bindings.rs output by introducing a header file and manual bindings can get out of sync, so we used a header file instead. The changes are about exporting JS_DupValue and JS_FreeValue in one place for our purposes, not relying on a private export by QuickJS itself.

Finally there's a new dependecy https://crates.io/crates/visibility. This is not strictly necessary and could be replaced with some code duplication, but it was introduced to combine the effort done with the export-sys feature with changes around drop. It's failing cargo-vet so please decide if it's something you'd want to accept. Internally it seems to only be one simple proc macro.

Why am I making this change?

We've been using quickjs-wasm-rs for some time now. About 3 months ago we ran into a bug in one of our services was crashing due to QuickJS GC. The issue was that we have a long-lived wasm instance within which we instantiate possibly multiple QuickJS contexts. These contexts internally share a global GC table, so they cannot be used concurrently, but it should be possible to use them sequentially (use one, drop it, create another one). Then we discovered that because quickjs-wasm-rs doesn't free the context nor values they remain in the global GC table. Later, when a different context invokes GC the old values reference a freed context. GC sees these values and assumes either that they are either still in use (leak, because their use count was never lowered to 0) or tries to free them (use after free, since they point to the previous context).

We've been running with this fork for rougly those 3 months and it seems to have resolved the issue. Today I rebased those changes on top of upstream javy and am opening this PR.

Checklist

(will do if these changes are wanted)

saulecabrera commented 4 months ago

Hi @TheEdward162 thanks for the pull-request. The reason for not hooking up GC is performance as you've correctly pointed out. Javy is intended to be used in short-lived programs, where we can take advantage of dropping WebAssembly instances and therefore claiming whatever memory was created during the execution of the program. That to say, we haven't tested Javy (and the bindings) in a different or more complex scenarios, like the one you described.

Regarding the PR itself, you might be interested in this draft PR: https://github.com/bytecodealliance/javy/pull/618; the proposal is to sunset quickjs-wasm-rs and quickjs-wasm-sys with the motivation of having/providing more ergonomic and feature complete bindings through rquickjs. I'm not sure how does your Javy use-case look like exactly (or if you're mostly using the bindings), but I'd assume that in either or scenario, once that PR lands, the issue that you're experimenting will be solved?

TheEdward162 commented 3 months ago

Hey, sorry it took longer to respond. Thanks for your response, this tells me everything I need to know. We are actually only using quickjs-wasm-rs and quickjs-wasm-sys so if the future there is to switch to rquickjs I think that is the most reasonable path forward for us too.

I'll close this PR now as it probably doesn't make sense anymore.