justjake / quickjs-emscripten

Safely execute untrusted Javascript in your Javascript, and execute synchronous code that uses async functions
https://www.npmjs.com/package/quickjs-emscripten

Loading WasmMemory into module #152

Open ppedziwiatr opened 4 months ago

ppedziwiatr commented 4 months ago

Hey,

as a follow-up of https://github.com/justjake/quickjs-emscripten/issues/146 - here's code that I'm trying to run:

import { getQuickJS, newQuickJSWASMModule, newVariant, RELEASE_SYNC } from "quickjs-emscripten";

async function main() {
  // module 1
  const QuickJS1 = await getQuickJS();
  const vm1 = QuickJS1.newContext();
  const res1 = vm1.evalCode(`const x = 100;
  function test() {
    return x;
  };
  `);
  res1.value.dispose();

  const testRes = vm1.unwrapResult(vm1.evalCode(`test()`))
  console.log("test result:", vm1.getNumber(testRes));
  const mem1 = QuickJS1.getWasmMemory();
  vm1.dispose();

  // module 2
  const variant = newVariant(RELEASE_SYNC, {
    wasmMemory: mem1
  });
  const QuickJS2 = await newQuickJSWASMModule(variant);
  const vm2 = QuickJS2.newContext();

  // getting " 'test' is not defined" here..
  const testRes2 = vm2.unwrapResult(vm2.evalCode(`test()`))
  console.log("test result:", vm2.getNumber(testRes2));
  testRes2.dispose();
  vm2.dispose();
}

main().catch(e => console.error(e)).finally();

What it does - it simply creates one quickjs module, evals some code, stores the module's Wasm memory - and then a second module is created with a Wasm memory from the first one. I would expect that all the code evaluated in the first one will be available in the second - but that's not the case. The result is:

test result: 100
y [ReferenceError]: 'test' is not defined
    at <eval> (eval.js)
Host: QuickJSUnwrapError
    at T.unwrapResult (file:///Users/ppe/warp/warp/node_modules/quickjs-emscripten-core/dist/chunk-VABBOY7Z.mjs:4:12568)
    at main (file:///Users/ppe/warp/warp/tools/quickjs-memory.mjs:26:24) {
  cause: {
    name: 'ReferenceError',
    message: "'test' is not defined",
    stack: '    at <eval> (eval.js)\n'
  },

Is this expected behaviour? If so, is it possible to somehow save a state of a module and resume the evaluation later with this exact state?

ppedziwiatr commented 4 months ago

One more experiment:

  const mem1 = new WebAssembly.Memory({
    initial: 256, //*65536
    maximum: 256  //*65536
  });
  const variant1 = newVariant(RELEASE_SYNC, {
    wasmMemory: mem1
  });
  const QuickJS1 = await newQuickJSWASMModule(variant1);
  const vm1 = QuickJS1.newContext();
  const res1 = vm1.evalCode(`const x = 100;
  function test() {
    return x;
  };
  `);
  res1.value.dispose();

  const testRes = vm1.unwrapResult(vm1.evalCode(`test()`));
  console.log("test result:", vm1.getNumber(testRes));
  const mem1Output = QuickJS1.getWasmMemory();
  console.log("buffers equal?", equal(mem1.buffer, mem1Output.buffer));

  function equal(buf1, buf2) {
    if (buf1.byteLength != buf2.byteLength) return false;
    const dv1 = new Uint8Array(buf1);
    const dv2 = new Uint8Array(buf2);
    for (let i = 0; i != buf1.byteLength; i++) {
      if (dv1[i] != dv2[i]) return false;
    }
    return true;
  }

- it seems that the initial buffer (mem1) and the buffer after evaluating some JS code (mem1Output) are exactly the same - I'm not sure if getWasmMemory() works properly..

justjake commented 4 months ago

In your first example, you create two VMs. Each VM instance is independent and has its own environment, even within the same WebAssembly memory. If you run the same logic inside one WebAssembly module, you will get the same result. You need to re-use the memory address of vm1 in the cloned WebAssembly module so that you are using vm1, instead of creating vm2.

it seems that the initial buffer (mem1) and the buffer after evaluating some JS code (mem1Output) are exactly the same

If I understand correctly, I think this would be expected because these two objects are the same object. I think it works similar to this:

const mem1 = []
const qjs1 = { mem: mem1 }
qjs1.mem.push('vm1')
_.isEqual(mem1, qjs1.mem) // true
mem1 === qjs1.mem // also true

justjake commented 4 months ago

You can get the memory address of various objects like QuickJSContext, QuickJSRuntime by inspecting their private internals looking for pointer types. You’ll need to use the private constructor to create clones of the objects with the same address but referencing the new WebAssembly module via existing private APIs.

ppedziwiatr commented 4 months ago

If I understand correctly, I think this would be expected because these two objects are the same object. I think it works similar to this:

const mem1 = []
const qjs1 = { mem: mem1 }
qjs1.mem.push('vm1')
_.isEqual(mem1, qjs1.mem) // true
mem1 === qjs1.mem // also true

Yeah, I've just realized that I should copy the original memory's buffer and compare it with a buffer from the getWasmMemory - and then they indeed are not equal.
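
The aliasing effect can be reproduced without QuickJS at all. The following is a minimal sketch using only the standard WebAssembly.Memory API; the buffersEqual helper is a hypothetical name, and the single byte write merely stands in for "the VM evaluated some code":

```js
// Hypothetical helper: byte-wise comparison of two ArrayBuffers.
function buffersEqual(a, b) {
  if (a.byteLength !== b.byteLength) return false;
  const va = new Uint8Array(a);
  const vb = new Uint8Array(b);
  for (let i = 0; i < va.length; i++) {
    if (va[i] !== vb[i]) return false;
  }
  return true;
}

const memory = new WebAssembly.Memory({ initial: 1, maximum: 1 });

// slice() copies the bytes, so later writes do not affect the snapshot.
const snapshot = memory.buffer.slice(0);
// A second reference to the same buffer is not a copy.
const alias = memory.buffer;

// Stand-in for the VM writing to its heap while evaluating code.
new Uint8Array(memory.buffer)[0] = 42;

const aliasEqual = buffersEqual(alias, memory.buffer);
const snapshotEqual = buffersEqual(snapshot, memory.buffer);
console.log({ aliasEqual, snapshotEqual });
```

Only the copy taken before the write differs from the live memory; the alias always compares equal because it is the same buffer.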

ppedziwiatr commented 4 months ago

You can get the memory address of various objects like QuickJSContext, QuickJSRuntime by inspecting their private internals looking for pointer types. You’ll need to use the private constructor to create clones of the objects with the same address but referencing the new WebAssembly module via existing private APIs.

hmm...this looks a bit complicated to me...any help here would be appreciated, though I obviously understand that you have better things to do ;)

ppedziwiatr commented 4 months ago

ok, here's what I've got so far

// module 1
  const mem1 = new WebAssembly.Memory({
    initial: 256, //*65536
    maximum: 256  //*65536
  });
  const variant1 = newVariant(RELEASE_SYNC, {
    wasmMemory: mem1
  });
  const QuickJS1 = await newQuickJSWASMModule(variant1);
  const vm1 = QuickJS1.newContext();
  const res1 = vm1.evalCode(`const x = 100;
  function test() {
    return x;
  };
  `);
  res1.value.dispose();
  const testRes = vm1.unwrapResult(vm1.evalCode(`test()`));
  console.log("test result:", vm1.getNumber(testRes));
  const mem1Output = QuickJS1.getWasmMemory();
  testRes.dispose();

  // module 2
  const variant2 = newVariant(RELEASE_SYNC, {
    wasmMemory: mem1Output
  });
  const QuickJS2 = await newQuickJSWASMModule(variant2);
  const runtime = QuickJS2.newRuntime();
  console.log("runtime created");
  console.log("Creating context with memory pointing to", vm1.ctx.value);
  const ctx = new Lifetime(
    vm1.ctx.value, // pointer to the memory of vm1
    undefined,
    (ctx_ptr) => {
      this.contextMap.delete(ctx_ptr);
      this.callbacks.deleteContext(ctx_ptr);
      this.ffi.QTS_FreeContext(ctx_ptr);
    }
  );
  console.log("lifetime created");
  vm1.dispose();

  // it seems to explode here...
  const vm2 = new QuickJSContext({
    module: QuickJS2,
    ctx,
    ffi: runtime.ffi,
    rt: runtime.rt,
    ownedLifetimes: [runtime],
    runtime,
    callbacks: runtime.callbacks
  });
  runtime.context = vm2;
  console.log("context created");

with the output being:

  test result: 100
runtime created
Creating context with memory pointing to 5326448
lifetime created
RuntimeError: table index is out of bounds
    at wasm://wasm/001e8b7e:wasm-function[687]:0x34871
    at wasm://wasm/001e8b7e:wasm-function[955]:0x4e69c
    at c._QTS_FreeRuntime [as QTS_FreeRuntime] (file:///Users/ppe/warp/warp/node_modules/@jitl/quickjs-wasmfile-release-sync/dist/emscripten-module.mjs:26:353)
    at s.disposer (file:///Users/ppe/warp/warp/node_modules/quickjs-emscripten-core/dist/chunk-VABBOY7Z.mjs:6:1211)
    at s.dispose (file:///Users/ppe/warp/warp/node_modules/quickjs-emscripten-core/dist/chunk-VABBOY7Z.mjs:1:2414)
    at s.dispose (file:///Users/ppe/warp/warp/node_modules/quickjs-emscripten-core/dist/chunk-VABBOY7Z.mjs:4:708)

Not sure if I can create a completely new runtime and attach a context that points to the memory of the old context - or maybe the runtime should be also pointing to the memory of the old runtime?

ppedziwiatr commented 4 months ago

Another try - by passing a pointer to the context in the ContextOptions

// module 1
  const mem1 = new WebAssembly.Memory({
    initial: 256, //*65536
    maximum: 256  //*65536
  });
  const variant1 = newVariant(RELEASE_SYNC, {
    wasmMemory: mem1
  });
  const QuickJS1 = await newQuickJSWASMModule(variant1);
  const vm1 = QuickJS1.newContext();
  const res1 = vm1.evalCode(`const x = 100;
  function test() {
    return x;
  };
  `);
  res1.value.dispose();
  const testRes = vm1.unwrapResult(vm1.evalCode(`test()`));
  console.log("test result:", vm1.getNumber(testRes));
  testRes.dispose();
  const vm1Ptr = vm1.ctx.value;
  vm1.dispose();
  const mem1Output = QuickJS1.getWasmMemory();

  // module 2
  const variant2 = newVariant(RELEASE_SYNC, {
    wasmMemory: mem1Output
  });
  const QuickJS2 = await newQuickJSWASMModule(variant2);
  const runtime = QuickJS2.newRuntime();
  console.log('vm1Ptr', vm1Ptr)
  const vm2 = runtime.newContext({
    contextPointer: vm1Ptr
  });
  console.log("context created");
  const testRes2 = vm2.unwrapResult(vm2.evalCode(`test()`));

- but with a similar result:

  test result: 100
vm1Ptr 5326448
context created
RuntimeError: table index is out of bounds
    at wasm://wasm/001e8b7e:wasm-function[776]:0x3f33a
    at wasm://wasm/001e8b7e:wasm-function[824]:0x446b8
    at wasm://wasm/001e8b7e:wasm-function[1182]:0x5c2cb
ppedziwiatr commented 4 months ago

One last try - with creating both runtime and context from pointers:

// module 1
  const mem1 = new WebAssembly.Memory({
    initial: 256, //*65536
    maximum: 256  //*65536
  });
  const variant1 = newVariant(RELEASE_SYNC, {
    wasmMemory: mem1
  });
  const QuickJS1 = await newQuickJSWASMModule(variant1);
  const vm1 = QuickJS1.newContext();
  const res1 = vm1.evalCode(`const x = 100;
  function test() {
    return x;
  };
  `);
  res1.value.dispose();
  const testRes = vm1.unwrapResult(vm1.evalCode(`test()`));
  console.log("test result:", vm1.getNumber(testRes));
  testRes.dispose();
  const vm1Ptr = vm1.ctx.value;
  const rt1Ptr = vm1.rt.value;
  vm1.dispose();
  const mem1Output = QuickJS1.getWasmMemory();

  // module 2
  const variant2 = newVariant(RELEASE_SYNC, {
    wasmMemory: mem1Output
  });
  const QuickJS2 = await newQuickJSWASMModule(variant2);
  console.log("vm1Ptr", vm1Ptr);
  console.log("rt1Ptr", rt1Ptr);
  const rt = new Lifetime(rt1Ptr, undefined, (rt_ptr) => {
    this.callbacks.deleteRuntime(rt_ptr)
    this.ffi.QTS_FreeRuntime(rt_ptr)
  })
  const runtime = new QuickJSRuntime({
    module: QuickJS2.module,
    callbacks: QuickJS2.callbacks,
    ffi: QuickJS2.ffi,
    rt
  });
  console.log("runtime created");
  // applyBaseRuntimeOptions(runtime, options)

  const vm2 = runtime.newContext({
    contextPointer: vm1Ptr
  });
  console.log("context created");
  vm2.evalCode(`test()`);

with output:

  test result: 100
vm1Ptr 5326448
rt1Ptr 5324520
runtime created
context created
RuntimeError: table index is out of bounds
    at wasm://wasm/001e8b7e:wasm-function[74]:0x3729
    at wasm://wasm/001e8b7e:wasm-function[539]:0x27f06
    at wasm://wasm/001e8b7e:wasm-function[953]:0x4deb5
    at wasm://wasm/001e8b7e:wasm-function[776]:0x3f33a
    at wasm://wasm/001e8b7e:wasm-function[824]:0x446b8

Any suggestions would be great :)

I've also run the code with the debug variant, i.e. import variant from "@jitl/quickjs-singlefile-mjs-debug-sync"

- it produces this output:

test result: 100
vm1Ptr 5326448
rt1Ptr 5324520
runtime created
context created
quickjs-emscripten: QTS_Eval: Detected module = false
RuntimeError: table index is out of bounds
    at 585 (wasm://wasm/01940eaa:wasm-function[619]:0x69b66)
    at 584 (wasm://wasm/01940eaa:wasm-function[618]:0x69a27)
    at 586 (wasm://wasm/01940eaa:wasm-function[620]:0x69c08)
    at 56 (wasm://wasm/01940eaa:wasm-function[90]:0x63c8)
    at file:///Users/ppe/warp/warp/node_modules/@jitl/quickjs-singlefile-mjs-debug-sync/dist/emscripten-module.mjs:829:14
    at ccall (file:///Users/ppe/warp/warp/node_modules/@jitl/quickjs-singlefile-mjs-debug-sync/dist/emscripten-module.mjs:4855:22)
    at e.QTS_Eval (file:///Users/ppe/warp/warp/node_modules/@jitl/quickjs-singlefile-mjs-debug-sync/dist/emscripten-module.mjs:4872:16)
    at file:///Users/ppe/warp/warp/node_modules/quickjs-emscripten-core/dist/chunk-VABBOY7Z.mjs:4:11422
    at s.consume (file:///Users/ppe/warp/warp/node_modules/quickjs-emscripten-core/dist/chunk-VABBOY7Z.mjs:1:2333)
    at T.evalCode (file:///Users/ppe/warp/warp/node_modules/quickjs-emscripten-core/dist/chunk-VABBOY7Z.mjs:4:11402)
justjake commented 4 months ago

Once you do vm1.dispose, the memory address is freed. You can't use vm1 again after it's disposed. Maybe try leaving off all the .dispose calls and see if that helps. Beyond that, one more suggestion:

You can use xxd mem1.bin > mem1.hex to turn a binary file into a hexdump of that file, then use diff -u (or your diff tool of choice) to compare the two hex files, e.g. diff -u mem1.hex mem2.hex
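
If dumping to files is inconvenient, a rough in-process alternative is to compare two snapshots directly. This is only a sketch: diffOffsets is a hypothetical helper, and the two byte writes stand in for whatever the VM changes between snapshots.

```js
// Hypothetical helper: list the first `limit` byte offsets where two snapshots differ.
function diffOffsets(a, b, limit = 16) {
  const va = new Uint8Array(a);
  const vb = new Uint8Array(b);
  const offsets = [];
  const length = Math.min(va.length, vb.length);
  for (let i = 0; i < length && offsets.length < limit; i++) {
    if (va[i] !== vb[i]) offsets.push(i);
  }
  return offsets;
}

const memory = new WebAssembly.Memory({ initial: 1 });
const before = memory.buffer.slice(0); // copy of the initial state

// Stand-in for the VM mutating its heap.
const heap = new Uint8Array(memory.buffer);
heap[8] = 1;
heap[1024] = 255;

const after = memory.buffer.slice(0); // copy of the modified state
const changed = diffOffsets(before, after);
console.log(changed.map((offset) => "0x" + offset.toString(16)));
```

The hexdump approach gives more context around each change; this variant is mainly useful for a quick check of whether and where two dumps diverge.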

ppedziwiatr commented 4 months ago

Once you do vm1.dispose, the memory address is free'd. You can't use vm1 again after it's disposed. Maybe try leaving off all the .dispose calls and see if that helps.

Yup, that's why I'm disposing vm1 after storing the pointer values:

const vm1Ptr = vm1.ctx.value;
const rt1Ptr = vm1.rt.value;
vm1.dispose();

const mem1Output = QuickJS1.getWasmMemory();

  // module 2
  const variant2 = newVariant(RELEASE_SYNC, {
    wasmMemory: mem1Output
  });

- but I'll try to remove all dispose calls according to your suggestions.

What I've managed to get working so far is a second vm created from the pointer of the first vm, but within the same runtime and module:

 const mem1 = new WebAssembly.Memory({
    initial: 256, //*65536
    maximum: 512  //*65536
  });
  const variant1 = newVariant(RELEASE_SYNC, {
    wasmMemory: mem1
  });
  const QuickJS1 = await newQuickJSWASMModule(variant1);
  const runtime = QuickJS1.newRuntime();

  // vm1
  const vm1 = runtime.newContext();
  const res1 = vm1.evalCode(`const x = 100;
  function test() {
    return x;
  };
  `);
  res1.value.dispose();
  const testRes = vm1.unwrapResult(vm1.evalCode(`test()`));
  console.log("test result:", vm1.getNumber(testRes));
  testRes.dispose();
  const vm1Ptr = vm1.ctx.value;
  const rt1Ptr = vm1.rt.value;
  vm1.dispose();

  // vm2 from vm1 ptr
  const vm2 = runtime.newContext({
    contextPointer: vm1Ptr
  });
  const testRes2 = vm2.unwrapResult(vm2.evalCode(`test()`));
  console.log("test result 2:", vm2.getNumber(testRes2));

I'll keep on experimenting, thanks for your suggestions!

ppedziwiatr commented 4 months ago

Ok, I think I've managed to make it work, at least on the most simple, PoC-level example. Pasting here in case someone needs to do something similar.

  1. `quickjs-memory-store.mjs` - this script creates a memory, runtime and context, evaluates some js code, calls this code and then dumps wasm module memory into a file. It also stores pointers to the runtime and the context.

```js
import { newQuickJSWASMModule, newVariant } from "quickjs-emscripten";
import releaseSyncVariant from "@jitl/quickjs-singlefile-mjs-release-sync";
import fs from "fs";

async function main() {
  // module 1
  const mem1 = new WebAssembly.Memory({
    initial: 256, //*65536
    maximum: 2048 //*65536
  });
  const variant1 = newVariant(releaseSyncVariant, {
    wasmMemory: mem1
  });
  const QuickJS1 = await newQuickJSWASMModule(variant1);

  // runtime 1
  const runtime1 = QuickJS1.newRuntime();

  // vm1
  const vm1 = runtime1.newContext();
  const res1 = vm1.evalCode(`let x = 100;
  function add() {
    x += 50;
    return x;
  };
  `);
  res1.value.dispose();
  const testRes = vm1.unwrapResult(vm1.evalCode(`add()`));
  console.log("add result (should be 150):", vm1.getNumber(testRes));
  testRes.dispose();

  // storing vm1 and runtime 1 pointers
  const vm1Ptr = vm1.ctx.value;
  const rt1Ptr = vm1.rt.value;
  console.log({ vm1Ptr, rt1Ptr });
  fs.writeFileSync("ptrs.json", JSON.stringify({ vm1Ptr, rt1Ptr }));

  // storing module 1 memory in file
  const buffer = QuickJS1.getWasmMemory().buffer;
  fs.writeFileSync("wasmMem.dat", new Uint8Array(buffer));

  // it is now safe to dispose vm1 and runtime1
  vm1.dispose();
  runtime1.dispose();
}

main().catch(e => console.error(e)).finally();
```

2. `quickjs-memory-read.mjs` - it reads the wasm memory from file and copies it into newly created `WebAssembly.Memory` instance. It then creates a new quickjs module, runtime and context (from the pointers stored in the first script) and calls the previously defined JS function, checking whether state was properly preserved.

```js
import {
  Lifetime,
  newQuickJSWASMModule,
  newVariant,
  QuickJSRuntime,
  RELEASE_SYNC
} from "quickjs-emscripten";

import debugSyncVariant from "@jitl/quickjs-singlefile-mjs-debug-sync";
import releaseSyncVariant from "@jitl/quickjs-singlefile-mjs-release-sync";
import fs from "fs";

async function main() {
  // reading memory from file, creating new Memory instance
  // and copying contents of the first module's memory into it
  const memoryBuffer = fs.readFileSync("wasmMem.dat");
  const existingBufferView = new Uint8Array(memoryBuffer);
  const pageSize = 64 * 1024;
  const numPages = Math.ceil(memoryBuffer.byteLength / pageSize);
  const newWasmMemory = new WebAssembly.Memory({
    initial: numPages,
    maximum: 2048
  });
  const newWasmMemoryView = new Uint8Array(newWasmMemory.buffer);
  newWasmMemoryView.set(existingBufferView);

  // module 2
  const variant2 = newVariant(releaseSyncVariant, {
    wasmMemory: newWasmMemory
  });

  const { rt1Ptr, vm1Ptr } = JSON.parse(fs.readFileSync("ptrs.json", "utf-8"));

  const QuickJS2 = await newQuickJSWASMModule(variant2);

  // creating runtime from rt1Ptr pointer
  const rt = new Lifetime(rt1Ptr, undefined, (rt_ptr) => {
    QuickJS2.callbacks.deleteRuntime(rt_ptr)
    QuickJS2.ffi.QTS_FreeRuntime(rt_ptr)
  })
  const runtime2 = new QuickJSRuntime({
    module: QuickJS2.module,
    callbacks: QuickJS2.callbacks,
    ffi: QuickJS2.ffi,
    rt
  });

  // creating context from vm1 ptr
  const vm2 = runtime2.newContext({
    contextPointer: vm1Ptr
  });

  const testRes2 = vm2.unwrapResult(vm2.evalCode(`add()`));
  console.log("add result 2 (should be 200):", vm2.getNumber(testRes2));
  testRes2.dispose();
  vm2.dispose();
  runtime2.dispose();
}

main().catch(e => console.error(e)).finally();
```

A few notes:

  1. For some reason the minimum memory is ~16 MB, i.e. the initial value has to be at least 256 (* 65536 = ~16 MB); I'm not sure I understand why. Setting a lower value fails with: RuntimeError: Aborted(LinkError: WebAssembly.instantiate(): memory import 15 is smaller than initial 256, got 255). Build with -sASSERTIONS for more info.

  2. The above example also works properly with the debug-sync variant, i.e. @jitl/quickjs-singlefile-mjs-debug-sync, but obviously the same variant has to be used for both the store and read scripts ;)
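
The page arithmetic behind note 1 and the restore step of the read script can be sketched in isolation. The error message quoted above suggests that the compiled module declares 256 pages as its initial memory, so the imported memory cannot be smaller. The memoryFromSnapshot helper below is a hypothetical name, and the example uses a plain WebAssembly.Memory in place of a QuickJS module:

```js
const PAGE_SIZE = 64 * 1024; // WebAssembly page size in bytes

// Hypothetical helper: copy a saved snapshot into a fresh WebAssembly.Memory
// that is just large enough to hold it.
function memoryFromSnapshot(bytes, maximumPages) {
  const initial = Math.ceil(bytes.byteLength / PAGE_SIZE);
  const memory = new WebAssembly.Memory({ initial, maximum: maximumPages });
  new Uint8Array(memory.buffer).set(bytes);
  return memory;
}

// 256 pages * 65536 bytes = 16 MiB, the minimum mentioned in note 1.
const source = new WebAssembly.Memory({ initial: 256, maximum: 2048 });
new Uint8Array(source.buffer)[12345] = 7; // stand-in for VM state

// Copy the bytes, as writing them to a file and reading them back would.
const snapshot = new Uint8Array(source.buffer.slice(0));
const restored = memoryFromSnapshot(snapshot, 2048);

const restoredPages = restored.buffer.byteLength / PAGE_SIZE;
const restoredByte = new Uint8Array(restored.buffer)[12345];
console.log({ restoredPages, restoredByte });
```

Because the page count is derived from the snapshot size, a dump taken after the memory has grown is restored into a memory of the grown size rather than the original 256 pages.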

Thanks @justjake for your help and all your suggestions!

EDIT - one last note - the issue with wasm memory dump size can be easily 'fixed' with compression:

  1. compression:

    const buffer = QuickJS1.getWasmMemory().buffer;
    
    const compressionStream = new CompressionStream('gzip');
    const uint8Buffer = new Uint8Array(buffer);
    const stream = new ReadableStream({
      start(controller) {
        controller.enqueue(uint8Buffer);
        controller.close();
      },
    });
    const compressedStream = stream.pipeThrough(compressionStream);
    const compressedBuffer = await new Response(compressedStream).arrayBuffer();
    
    fs.writeFileSync("wasmMem.dat", new Uint8Array(compressedBuffer));

  2. decompression:

    const compressedBuffer = fs.readFileSync("wasmMem.dat");
    const compressedBufferView = new Uint8Array(compressedBuffer);
    const decompressionStream = new DecompressionStream('gzip');
    const compressedStream = new ReadableStream({
      start(controller) {
        controller.enqueue(compressedBufferView);
        controller.close();
      },
    });
    const decompressedStream = compressedStream.pipeThrough(decompressionStream);
    const decompressedBuffer = await new Response(decompressedStream).arrayBuffer();
    const memoryBuffer = new Uint8Array(decompressedBuffer);

    const pageSize = 64 * 1024;
    const numPages = Math.ceil(memoryBuffer.byteLength / pageSize);
    const newWasmMemory = new WebAssembly.Memory({
      initial: numPages,
      maximum: 2048
    });
    const newWasmMemoryView = new Uint8Array(newWasmMemory.buffer);
    newWasmMemoryView.set(memoryBuffer);

    // module 2
    const variant2 = newVariant(releaseSyncVariant, {
      wasmMemory: newWasmMemory
    });

In the case of the above example, it reduces the memory dump size from ~16 MB to ~87 KB.
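
The compression and decompression steps can also be wrapped into a pair of small helpers and checked with a round trip. This is a sketch only: gzip, gunzip and roundTrip are hypothetical names, it assumes a runtime where Blob, Response, CompressionStream and DecompressionStream are available as globals (for example Node 18+), and it uses a mostly empty WebAssembly.Memory rather than a real QuickJS heap, so the sizes will not match the figures above.

```js
// Hypothetical helpers wrapping the standard Compression Streams API.
async function gzip(bytes) {
  const stream = new Blob([bytes]).stream().pipeThrough(new CompressionStream("gzip"));
  return new Uint8Array(await new Response(stream).arrayBuffer());
}

async function gunzip(bytes) {
  const stream = new Blob([bytes]).stream().pipeThrough(new DecompressionStream("gzip"));
  return new Uint8Array(await new Response(stream).arrayBuffer());
}

// Compress a memory dump, decompress it again and verify nothing was lost.
async function roundTrip() {
  const memory = new WebAssembly.Memory({ initial: 16 }); // 1 MiB, mostly zeros
  const original = new Uint8Array(memory.buffer);
  original[100] = 42; // stand-in for VM state

  const compressed = await gzip(original);
  const restored = await gunzip(compressed);

  const identical =
    restored.byteLength === original.byteLength &&
    restored.every((value, index) => value === original[index]);

  return {
    originalSize: original.byteLength,
    compressedSize: compressed.byteLength,
    identical,
  };
}

const roundTripResult = roundTrip();
roundTripResult.then((result) => console.log(result));
```

A freshly created heap is mostly zeros, which is why gzip shrinks it so much; a heap holding a lot of real data would compress less well.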