patriksimek / vm2

Advanced vm/sandbox for Node.js
MIT License
3.86k stars 293 forks source link

Any tips for improving performance of `vm.run()`? #514

Closed theinterned closed 1 year ago

theinterned commented 1 year ago

Hello,

I'm hoping you can help me with this question. I'm trying to optimize the performance of calling vm.run().

I have an SSR server where I want to create a new context for every render. I am using VM2 to create a sandbox for my render. However I am noticing that state is leaking (eg variables declared in module-level scope) between renders. I need render requests isolation, and have been looking at creating a new vm context in which to run the script for every render request.

This looks something like:

let script;

function render(args) {
  script ??= new VMScript(`module.exports = require('${handlerPath}')`) // this takes a couple ms and can be cached between renders
  const vm = new NodeVM(options) // this takes a couple ms. I recreate this on every render to not leak state between renders
  const handler = vm.run(script) // this takes > 1s!!! And needs to be done on every render
  return handler(args)
}

I think the obvious place I'm looking to optimize is in vm.run. So I am looking for any advice about how to optimize this. Alternatively I would be open to any other patterns that would ensure that state (globals, module scope vars etc) don't get shared between renders in a more performant way.

Thank you in advance for any help you can provide!

XmiliaH commented 1 year ago

There is currently no way to improve this as require files are not cached. You can try the following hack which gave me a 10% speed boost.

const options = {
    require: { external: true, root: './'}
};
const script = new VMScript(`module.exports = require('uuid');`);
const { packageCache, scriptCache } = new NodeVM(options)._resolver ?? {};

function render(args) {
    const vm = new NodeVM(options);
    try {
        Object.assign(vm._resolver ?? {}, { packageCache, scriptCache });
    } catch { /* This is currently a hack! */ }
    const handler = vm.run(script);
    return handler(args);
}

This shares the internal package and script caches for require between the instances. Getting the performance better than this will likely not be possible.

theinterned commented 1 year ago

Oh thanks wow 🤩! I will give this a try. Will sharing the packageCache and scriptCache leak any data between vms?

XmiliaH commented 1 year ago

They will not leak any data between vms. It only caches the compiled VMScripts for modules so that hopefully require will be a bit faster.

theinterned commented 1 year ago

As a follow up: @XmiliaH , your optimization has been a HUGE perf win for us. Using pretty much exactly your code, I am able to get the time to create a fresh render context from ~1.5s all the way down to ~100ms! 🚀

I think this makes sense as the script we're requiring in our VM is in fact an entire webpack bundle of react code, so avoiding re-linking everything is a major win.

Thank you so much this is awesome!

One question I have about this: this _resolver definitely seems to be a private API: it's not exposed on the NodeVM type. Would you consider changing this and making a public API for this?

XmiliaH commented 1 year ago

Thanks for the feedback. Yes _resolver is a private API. I will look into making a public API for this.

XmiliaH commented 1 year ago

@theinterned Could you take a look at https://github.com/patriksimek/vm2/pull/519 if it would fit your usecase.

theinterned commented 1 year ago

Oh #519 looks very interesting! I ran out of time to try it out this week, but I'll mess around with it next week for sure. Thanks you! ⭐⭐⭐⭐⭐

manuelpuyol commented 1 year ago

hey @XmiliaH, do you know if the _resolver cache would also impact dynamic imports?

XmiliaH commented 1 year ago

VM2 does not allow for es modules, so there is no dynamic import.