Open joyeecheung opened 5 months ago
Avoid UTF8 transcoding by reading the source code directly from disk as a buffer (this needs to dance with CJS loader monkey patching)
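A rough sketch of what "reading as a buffer" could mean in practice, not how the compile cache is actually implemented. `readSourceForCache` and `sourceText` are hypothetical helper names, and SHA-256 is only a stand-in for whatever hash the cache really uses. The point is that the cache key is derived from the raw bytes, and decoding to a string happens only if something (for example a monkey-patched loader) asks for the text.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');

// Hypothetical helper: read a module's source as raw bytes and derive a
// cache key from those bytes without decoding them into a JS string.
function readSourceForCache(filename) {
  // Omitting the encoding argument makes readFileSync return a Buffer.
  const bytes = fs.readFileSync(filename);
  // SHA-256 is a stand-in here; any hash over the raw bytes would do.
  const key = crypto.createHash('sha256').update(bytes).digest('hex');
  return { bytes, key };
}

// Hypothetical helper: decode lazily, only when the text is really needed
// (for example because a patched loader wants to inspect the source).
function sourceText(entry) {
  return entry.bytes.toString('utf8');
}
```

On a cache hit, `sourceText()` would never need to be called, which is where the transcoding cost would be saved.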
FWIW I think loaders/require hooks are probably very common in dev and pretty rare in production (where compile cache has the most value) but that's just intuition.
> and pretty rare in production
I would think it's the opposite for tracing agents - although they usually don't care about the source code (except the current loaders built on top of the off-thread hooks, like import-in-the-middle, which are forced to do a hacky analysis of the source code; that is why I am proposing an in-thread `link()` hook for them in https://github.com/nodejs/loaders/pull/198 so they don't have to do this).
Also, speaking of loader hooks, I think we need to convert the CJS loader to pass buffers around regardless for future binary file loading support (for example if the custom loader wants to support loading wasm, or zip, or anything that's not stored as uh, bytes encoded in UTF8 on disk).
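To illustrate why passing strings around is a problem for non-UTF8 content, here is a small sketch. The helper name `survivesUtf8RoundTrip` is made up for this example. Bytes that are not valid UTF8 get replaced with U+FFFD on decode, so the original content cannot be recovered from the string.

```javascript
// Hypothetical helper: does this content survive being decoded to a UTF8
// string and encoded back, which is what a string-based loader pipeline
// effectively does?
function survivesUtf8RoundTrip(bytes) {
  return Buffer.from(bytes.toString('utf8'), 'utf8').equals(bytes);
}

// Plain JS source is fine.
const textSource = Buffer.from('module.exports = 1;', 'utf8');

// Arbitrary binary content: 0xff and 0xfe are never valid in UTF8, so each
// one is decoded to U+FFFD, which re-encodes as three bytes (EF BF BD).
const binarySource = Buffer.from([0x00, 0x61, 0x73, 0x6d, 0xff, 0xfe]);
const roundTripped = Buffer.from(binarySource.toString('utf8'), 'utf8');

console.log(survivesUtf8RoundTrip(textSource));   // true
console.log(survivesUtf8RoundTrip(binarySource)); // false
console.log(binarySource.length, roundTripped.length); // 6 10
```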
(Now I am spamming this tracking issue but) after looking into existing monkey patching usages in popular packages (i.e. I did a GitHub code search), I think the highest-priority item should be an API for packages to turn this on programmatically. I don't have a great idea about what this API should look like though, so ideas are welcome. (Maybe `process.enableCompileCache(dir)` with some re-entrancy guards would be good enough, or maybe it's a terrible idea to make it per-thread because packages can step on each other's toes?).
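For discussion, one possible shape of the re-entrancy guard, purely as a sketch. Nothing here is an existing API: `createCompileCacheSwitch`, the returned `enableCompileCache` function, and the status strings are all made up, and "first caller wins" is just one policy that could be picked.

```javascript
// Hypothetical sketch: a per-thread switch where the first caller wins and
// later callers are told what happened instead of overriding the directory.
function createCompileCacheSwitch() {
  let enabledDir = null; // directory chosen by the first successful call

  return function enableCompileCache(dir) {
    if (enabledDir === null) {
      enabledDir = dir;
      // A real implementation would start caching compiled code here.
      return { status: 'enabled', directory: enabledDir };
    }
    // Re-entrant call: never switch directories behind the first caller's back.
    const status = enabledDir === dir ? 'already-enabled' : 'conflict';
    return { status, directory: enabledDir };
  };
}

// Two packages in the same thread trying to turn the cache on:
const enableCompileCache = createCompileCacheSwitch();
console.log(enableCompileCache('/tmp/cache-a').status); // enabled
console.log(enableCompileCache('/tmp/cache-a').status); // already-enabled
console.log(enableCompileCache('/tmp/cache-b').status); // conflict
```

Returning the directory already in use at least lets the second package see whose toes it stepped on, rather than silently losing.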
Other hashing algorithm
xxhash, by the creator of zstd, seems like a good non-cryptographic hash algorithm.
Follow-up to https://github.com/nodejs/node/issues/47472. Some items that can be investigated: