Open twodotsmax opened 1 year ago
Caching is something we've been interested in for a long time. I believe @peterp implemented a custom solution at Snaplet. I'll give him a nudge about this.
Thanks in advance @twodotsmax for offering to chip away at next steps!
Just adding that this is also one of my largest Redwood issues ATM; a server reload is ~7s, which is one of the slowest parts of my entire system.
At Artsy we made https://github.com/artsy/express-reloadable (writeup: http://artsy.github.io/blog/2017/12/05/Express-Reloadable-Update/ ), which addressed most of our issues. I don't think it works in an ESM world, but until Redwood as a whole supports ESM apps, the techniques in it could work.
Steps for people who need a solution now:
```js
'use strict';
const {
  performance,
  PerformanceObserver,
} = require('node:perf_hooks');
const mod = require('node:module');

// Monkey patch the require function
mod.Module.prototype.require = performance.timerify(mod.Module.prototype.require);
require = performance.timerify(require);

// Activate the observer
const obs = new PerformanceObserver((list) => {
  const entries = list.getEntries();
  entries.forEach((entry) => {
    console.log(`require('${entry[0]}')`, entry.duration);
  });
  performance.clearMarks();
  performance.clearMeasures();
  obs.disconnect();
});
obs.observe({ entryTypes: ['function'], buffered: true });

require('some-module');
```
6. Postprocess the output so that you only keep the first instance of loading a particular module; the second load hits the require cache and understates the load time. You can also take the max load time.
7. Sort the module loading data by load time to find your most expensive require() calls.
8. If the expensive module is not necessary at startup, replace the top-level import with a require() that does not run at module load time. For example, if you use a library that makes a remote API call, don't require() it at the top of the file; require it when the API call is made, and the module cache makes every later call free. Note that a lazy require() will block the event loop, so you may want to perform the lazy load only in development. If the module is necessary on startup but isn't needed in development, or isn't needed 100% of the time, you can conditionally require it.
Interesting (and intense!), from my perspective ~50% of the latency I am seeing is framework related:
I guess you're seeing that middle phase continue to grow, so trimming the middle might be useful for me in the short term. I had wondered whether, once Vite has stabilized, vite-node might be a solution to base the API dev server on.
@peterp @twodotsmax @orta if we can determine the path forward here, I'm all in to prioritize the effort. Keep me posted
I've gone down this path a few times and modifying the require cache never feels robust. The main reason is that live-reloading files via the require cache has unexpected consequences because it's additive. I called this "spooky reloads."
The problem is best illustrated by example code:
Step 1: The development server loads `server.ts`, adds `myFirstFunction` to the register, and executes `myFirstFunction()`:

```diff
+ const myFirstFunction = () => {
+   console.log('called `myFirstFunction`')
+ }
+ myFirstFunction()
```
Step 2: The user modifies `server.ts`, removes `myFirstFunction`, adds `mySecondFunction`, but mistakenly executes `myFirstFunction`:

```diff
- const myFirstFunction = () => {
-   console.log('called `myFirstFunction`')
- }
- myFirstFunction()
+ const mySecondFunction = () => {
+   console.log('called `mySecondFunction`')
+ }
+ myFirstFunction()
```
Even though the user deleted `myFirstFunction` from their code, we reloaded the file and added the second function but didn't remove the first, so `myFirstFunction` still executes.
The list of problems that this sort of reloading mechanism can introduce is vast, and each time users hit one it can feel like they're losing their minds, because as developers we expect the file on our filesystem to match what's in the runtime.
Just to clarify the comment I originally left in the code: // TODO: Use v8 caching to load these crazy fast.
v8 has a mechanism to extract and restore the bytecode that's exposed in NodeJS:
Code caching (also known as bytecode caching) is an important optimization in browsers. It reduces the start-up time of commonly visited websites by caching the result of parsing + compilation.
The idea was to have the "build-server" save the bytecode to disk, and then to only use the saved bytecode in the "dev-server." The hope was that this would improve start-up time, and to avoid the above mentioned issues with invalidating the require cache.
This concept is completely untested. It may be an easy performance win, but it might not be. As far as I know Next uses a form of hot-module-reloading.
Alternative approaches to writing our own code that could be a short-term win:
1. ~~Snapshots may be a very quick 'n dirty win for the dev-server. We just add these flags to the start-up: build and restore~~
2. `v8-compile-cache`: attaches a require hook to use V8's code cache to speed up instantiation time. The "code cache" is the work of parsing and compiling done by V8.

Edit: Removed option 1 since it wouldn't allow us to inject new code.
As an aside I wanted to point out that the lazy loading technique which we used in the CLI could also help you figure out why cold starts are slow (which modules are taking the most time) and then you could import them lazily.
This RFC shows how we figured out what was taking long to import, and how we improved it: https://github.com/redwoodjs/redwood/issues/6027
Happy to do a video on `0x` if someone needs additional guidance.
@twodotsmax @orta checking in about possible interest + availability to dig into this. Understood either way. Just didn't want to miss the chance if either of you is interested.
Summary
Hot reloading of the API service is too slow for large projects.
Motivation
David left this comment in lambdaLoader.ts 8 months ago:
// TODO: Use v8 caching to load these crazy fast.
As our project scaled, the load time for the /graphql function went to 12 seconds, which actually started to slow down development because when the service code changes, every function must be reloaded.
A CPU profile pointed to lots of time spent in "require", and using the `perf_hooks` "timerify" function, we were able to find the slowest-to-load modules and skip/lazy-load them during development.
Detailed proposal
I am willing to do more investigation, but wanted to hear from the team first about whether v8 module caching can solve this if used correctly.
Are you interested in working on this?