jonathantneal opened this issue 5 years ago
It seems that using the vm module is the only comfortable choice.
@marxangels How can I use the vm module to invalidate the cache?
Run all the code in a vm context so the module cache is under your control.
According to @cspotcode in the discussion linked below, there are also some concerns about using the vm module for this kind of scenario:
https://github.com/mochajs/mocha/pull/4855#issuecomment-1077818595
A simple module-level hot-reload for my express web application, in less than 200 lines of code: express-hot-reload.js, if someone needs it.
This demo shows how to use the vm module and control your own module cache.
Never thought it was so difficult for coders to communicate... well, let's make it clearer.
It is obviously not difficult for the Node.js core team to provide such a cache-delete interface. But why not? Because you can't guarantee to solve the dependency problem between modules at all levels; specific application scenarios require specific control. Using the vm module is the only choice.
End! Fck! Bye bye!
Thanks for taking the time to share this @marxangels. It is very much on-topic and I, personally, appreciate it.
The amount of time I have spent on this in total is just saddening.
If only cache invalidation were exposed - I still don't understand why it isn't :cry:
I don't fully understand how express-hot-reload.js works, and it gives me errors that I don't understand either.
From the line await module.evaluate(); I'm getting:
/user/project/node_modules/@hapi/hoek/lib/error.js:23
Error.captureStackTrace(this, exports.assert);
^
Error: regex must be a RegExp
at new module.exports (/user/project/node_modules/@hapi/hoek/lib/error.js:23:19)
at module.exports (/user/project/node_modules/@hapi/hoek/lib/assert.js:21:11)
at internals.Base.method (/user/project/node_modules/joi/lib/types/string.js:531:17)
at /user/project/src/resources/formats.js:15:21
at SourceTextModule.evaluate (node:internal/vm/module:226:23)
at SrcModule (file:///user/project/test/express-hot-reload.js:91:16)
at async linker (file:///user/project/test/express-hot-reload.js:146:20)
at async ModuleWrap.<anonymous> (node:internal/vm/module:315:24)
at async Promise.all (index 9)
at async SourceTextModule.<computed> (node:internal/vm/module:333:11)
Are there any resources available that detail how this all works?
The only way I know of to achieve this currently is cache busting. This article discusses it: https://dev.to/giltayar/mock-all-you-want-supporting-es-modules-in-the-testdouble-js-mocking-library-3gh1
If there’s a way to achieve it via vm, that would be great; if someone wants to open a PR to implement import { unload } from 'module' and/or import { replace } from 'module' (for individual module replacing/reloading), I would be happy to review one. This is absolutely a problem that the Node core devs would love to solve, but we’ve been flummoxed by V8 not providing an easy API for module unloading. If someone can find another way, or add the missing feature to V8, we would greatly appreciate it.
Just wanted to chime in with my own use case(s) here as another advocate for this feature, as part of a website builder tool I work on.
In my tool, page files export a page component, so I needed this kind of cache busting for local development: when I change my code and a live reload is triggered in the browser, I want to see the new content. Initially I tried the query-string technique but for some reason couldn't get it to work back then. Since I was rendering Web Components on the server side, and thus had a little shim for the DOM / browser features I needed, I opted for a Worker thread because it felt safer to shield all that away from the rest of the global runtime.
const routeModuleLocation = path.join(pagesDir, routeFilename);
const routeWorkerUrl = `...`;
const html = await new Promise((resolve, reject) => {
  const worker = new Worker(routeWorkerUrl);
  worker.on('message', (result) => {
    const { html } = result;
    resolve(html);
  });
  worker.on('error', reject);
  worker.on('exit', (code) => {
    if (code !== 0) {
      reject(new Error(`Worker stopped with exit code ${code}`));
    }
  });
  worker.postMessage({
    ...
  });
});
I've just encountered another scenario for this same file-based routing approach, this time for API endpoints instead of pages, which just export a handler function and return a Response object (think serverless and edge functions). Again, for local development I would like to be able to edit my code and see changes on live reload. However, for this a Worker seemed like overkill, and this time I was able to get the simpler cache-busting technique to work. (hurrah!)
let href = new URL(apiRoute, `file://${apisDir}`).href; // apiRoute -> /api/greeting.js

if (isDevMode) {
  href = `${href}?t=${Date.now()}`;
}

const { handler } = await import(href);
const req = new Request(new URL(`https://localhost:1984${apiRoute}`));
const res = await handler(req);
// ...
So mostly all to say:
Thanks everyone and appreciate all the hard work! ✌️
My current workaround is the following: create a temporary copy of the file with a hashed name, import it and delete it. It's a dirty solution, but it worked for me and maybe it will help somebody else.
import fs from "node:fs";
import path from "node:path";

export async function importFresh(modulePath) {
  const filepath = path.resolve(modulePath);
  const fileContent = await fs.promises.readFile(filepath, "utf8");
  const ext = path.extname(filepath);
  const extRegex = new RegExp(`\\${ext}$`);
  const newFilepath = `${filepath.replace(extRegex, "")}${Date.now()}${ext}`;
  await fs.promises.writeFile(newFilepath, fileContent);
  const module = await import(newFilepath);
  fs.unlink(newFilepath, () => {});
  return module;
}
EDIT: I was using .trimEnd(".ts") before, but trimEnd doesn't even accept arguments. As a result, it was generating files named like example.ts123456789.ts. It still worked, but I've fixed that and also changed the code to accept any extension, so that .js files will work too.
Is there any Estimated Time when this feature will be released?
Pull requests are welcome!
You can evict modules by exposing internals:
import { createRequire } from "node:module";

// `log.mjs` just contains `console.log("wow");`
import "./log.mjs";

const evictModule = function() {
  try {
    const require = createRequire(import.meta.url);
    const loader = require("internal/process/esm_loader");
    const { loadCache } = loader.esmLoader;
    if (loadCache) {
      return url => {
        if (loadCache.has(url)) {
          loadCache.delete(url);
          return true;
        } else {
          return false;
        }
      };
    }
  } catch {}
}();

console.log(evictModule?.(import.meta.resolve("./log.mjs")));
await import("./log.mjs");
$ node -v
v20.8.0
$ node --expose-internals main.mjs
wow
true
wow
node does keep references stored in another map that's more deeply buried, so the evicted modules don't actually get garbage collected. But if your intention is to re-import, then it's worth a shot.
On the topic of ESM hot reloading, I made a pretty sophisticated HMR --loader here: https://github.com/braidnetworks/dynohot
It's loosely compatible with the esm-hmr, Vite, and Webpack APIs. There are probably differences in execution order because we're supporting top-level await and promise-returning accept, dispose, & prune handlers. My team has been using it for a couple of months now with really good results.
So is this essentially deleting a pointer to the real module loaded in V8? What's actually stored in that map?
That's not really how V8 works. You can delete a handle to a value, and then it's V8's job to garbage collect the "pointer" at some point when it feels the vibes are good. Regardless, this is all plain old JavaScript. The require in my example just pulls in this internal module: https://github.com/nodejs/node/blob/1dc0667aa6096f10c5f95471dfe27e78db1dafd5/lib/internal/process/esm_loader.js
We just so happened to luck out that they've internally exported the loader "cache" [I think cache is not the best name for this map, because the contents of the record affect correctness and not performance]. It's been a while since I looked at this code and it changed recently [053511f7eca7cf50233abb10e7d88588aea6fc93]. And like I said, this is not the only reference that nodejs holds to the module, so V8 will not collect a module deleted in this manner; once imported, it's in the heap forever. This is not a limitation of V8, it's just a consequence of nodejs's implementation, and is not a hard thing to change. We know it's possible to collect stale modules because vm and isolated-vm do it.
Implementing this feature in a blessed way in nodejs is not difficult but doing so may be in direct violation of the specification:
If this operation is called multiple times with the same (referrer, specifier) pair and it performs FinishLoadingImportedModule(referrer, specifier, payload, result) where result is a normal completion, then it must perform FinishLoadingImportedModule(referrer, specifier, payload, result) with the same result each time.
That might be ok because nodejs has lots of power tools that alter the fabric of reality.
What I don't like about delete require.cache[key] and my sample above is that they give any module the capability to remove any other module. That was ok during the complete anarchy of CommonJS, but ESM should probably hold itself to a higher standard. I haven't thought about this much at all, and this is an idea off the top of my head, but someone might consider implementing something like import.meta.releaseSelf(), which would allow a module to release itself from the module graph. In the case of an error during dynamic import, you could define the release function as a property on the error object itself [and all consumers of an errored dynamic module would need to release]. That would isolate the powerful capability to known sites. Or maybe it belongs as an import attribute, i.e. import('./maybe.mjs', { with: { weak: true } }), and is only allowed on dynamic imports.
The loadCache you’re referring to exists to be spec compliant with the part of the spec you quoted. It’s also more efficient to load a module from memory rather than from disk on subsequent imports, but spec compliance was the primary motivator. This cache is separate from the modules loaded into V8; those are contained within V8 in its own memory.
It’s been a long time since I looked into this, but my understanding from way back when was that there is no way to delete or replace an ES module once it’s been loaded into V8. There exist some debugging-protocol methods to do things along those lines, but they haven’t been extended to ESM (at least as of a few years ago; it would be great if I’m wrong about this now). Hence the existing methods of hot module reload that involve wrappers around modules and query strings; they get the job done (see Vite for a great example) but slowly use more and more memory the longer your dev server runs, because the replaced older versions of each module never get deleted from memory.
If you can find a way to purge old ES modules from V8, that would be a wonderful discovery. I don’t think we would be concerned with making such an API available to users; the entire module customization hooks API is available to users, and it allows whatever spec violations the users want (it’s not like running CoffeeScript is spec compliant, but those are the types of use cases that the hooks enable). Node aims to be spec compliant by default, but not to block users from customizing it to behave in noncompliant ways if desired.
It’s been a long time since I looked into this, but my understanding from way back when was that there is no way to delete or replace an ES module once it’s been loaded into V8.
I have a decent bit of experience here since I was very much in the weeds on this while working on isolated-vm. v8 doesn't treat modules much differently than plain Object allocations. They are HeapObject instances which can be garbage collected like anything else. Otherwise Chrome would need to continually allocate new isolates on the same origin as you browse different pages on a website. I'd love to see evidence to the contrary, but Eternal module handles would be antithetical to the v8 design philosophy.
nodejs demonstrates this collectability, today, with the vm module:
import { memoryUsage } from "node:process";
import { SourceTextModule } from "node:vm";

for (let ii = 0; ; ++ii) {
  const module = new SourceTextModule(`
    // allocate and materialize 1mb uint8 array
    export const uint8 = new Uint8Array(1024 * 1024);
    for (let ii = 0; ii < uint8.length; ii += 4096) {
      uint8[ii] = 1;
    }`, { identifier: "file:///module0" });
  await module.link(() => {});
  await module.evaluate();
  // quickly stabilizes around 40mb. other statistics are stable as well.
  console.log(ii, memoryUsage().heapTotal >> 20);
}
node --experimental-vm-modules test.mjs
0 3
1 3
// [...]
122570 37
122571 37
// [...]
184255 38
184256 38
Actually, and this is surprising, my evictModule example does garbage collect the evicted modules. This wasn't true the last time I looked into it (v20.1.0), so recent changes to nodejs have removed the deeper reference. Sample code:
chunk.mjs:
// allocate and materialize 1mb uint8 array
export const uint8 = new Uint8Array(1024 * 1024);
for (let ii = 0; ii < uint8.length; ii += 4096) {
  uint8[ii] = 1;
}
test.mjs:
import { createRequire } from "node:module";

const evictModule = function() {
  try {
    const require = createRequire(import.meta.url);
    const loader = require("internal/process/esm_loader");
    const { loadCache } = loader.esmLoader;
    if (loadCache) {
      return url => {
        if (loadCache.has(url)) {
          loadCache.delete(url);
          return true;
        } else {
          return false;
        }
      };
    }
  } catch {}
}();

const register = new FinalizationRegistry(name => {
  console.log("collected", name);
});

for (let ii = 0; ; ++ii) {
  const { uint8 } = await import("./chunk.mjs");
  register.register(uint8, ii);
  evictModule?.(import.meta.resolve("./chunk.mjs"));
}
Results:
-> % node --expose-internals test.mjs
collected 63
collected 62
collected 61
collected 60
// [and so on]
So the question of whether or not this is possible has an answer: it is possible.
I don’t think we would be concerned with making such an API available to users; the entire module customization hooks API is available to users
The only way to implement this functionality as a loader would be to create the entire module graph within vm. This is a non-starter because it is not currently possible to invoke the loader chain programmatically.
Anyway, I would clearly be interested in having a feature like this, but I also can't come up with something that's safe. I think an import attribute sounds interesting, but if we had an attribute which skips the loader cache, then you'd run into weird situations where a module that imports itself would actually get a namespace object belonging to a different instance of the same module.
@laverdet I believe if you make another module which statically imports your generated modules (even without importing anything from them; just import "foo" will do), it will never release them, even though nothing directly references any of their resources. If that's not the case anymore, it's possible there's a path forward here.
@devsnek I'm sorry, it's just not true. v8 does not leak module handles, full stop. I have a great deal of respect for the nodejs team and everything they've done for the community but this condition does not exist in v8. Dynamic loading and unloading of code was just previously not a design goal of nodejs, and that is ok.
My vm example from above used to leak, but the leak was node's fault. It was actually just fixed in v20.8.0 by @joyeecheung -- see: https://github.com/nodejs/node/commit/b0ce78a75b & https://github.com/nodejs/node/commit/4e578f8ab1
As for whether or not this was ever a condition in v8 I can go back to 2018 [v8 6.8.275] under Docker and isolated-vm. In this example a single isolate continually compiles, links, and evaluates a graph of 3 modules and stays under 2mb heap size (the array buffers are externally allocated).
Dockerfile:
FROM node:10
RUN npm install isolated-vm@4.3.0
COPY isolated.mjs .
ENTRYPOINT node --experimental-modules isolated.mjs
isolated.mjs:
import ivm from "isolated-vm";

const isolate = new ivm.Isolate({ memoryLimit: 128 });
console.log("running", process.arch, process.versions);
for (let ii = 0; ; ++ii) {
  const main = isolate.compileModuleSync(
    `import { uint16 } from "chunk16";
     import { random, uint8 } from "chunk8";
     respond.applySync(undefined, [ random, uint8.length, uint16.length ]);`);
  const module8 = isolate.compileModuleSync(
    `import { uint16 } from "chunk16";
     export { uint16 };
     // verify that a new module is being run each time
     export const random = ${Math.random()};
     // allocate and materialize 1mb uint8 array
     export const uint8 = new Uint8Array(1024 * 1024);
     for (let ii = 0; ii < uint8.length; ii += 4096) {
       uint8[ii] = 1;
     }`);
  const module16 = isolate.compileModuleSync(
    `import { uint8 } from "chunk8";
     export { uint8 };
     // allocate and materialize 2mb uint16 array
     export const uint16 = new Uint16Array(1024 * 1024);
     for (let ii = 0; ii < uint16.length; ii += 4096) {
       uint16[ii] = 1;
     }`);
  const context = isolate.createContextSync();
  context.global.setSync("respond", new ivm.Reference((...args) =>
    console.log("observed", ...args)));
  main.instantiateSync(context, specifier => {
    switch (specifier) {
      case "chunk8": return module8;
      case "chunk16": return module16;
      default: throw new Error();
    }
  });
  main.evaluateSync(context);
  // These functions simply release the underlying `Persistent<T>` v8 handles.
  // They're not an exotic hack.
  context.release();
  main.release();
  module8.release();
  module16.release();
  console.log(ii, `${isolate.getHeapStatisticsSync().used_heap_size >> 20}mb`);
}
$ docker build --platform amd64 -t modules-test .
// [...]
$ docker run -t modules-test
(node:8) ExperimentalWarning: The ESM module loader is experimental.
running x64 { http_parser: '2.9.4',
node: '10.24.1',
v8: '6.8.275.32-node.59',
uv: '1.34.2',
zlib: '1.2.11',
brotli: '1.0.7',
ares: '1.15.0',
modules: '64',
nghttp2: '1.41.0',
napi: '7',
openssl: '1.1.1k',
icu: '64.2',
unicode: '12.1',
cldr: '35.1',
tz: '2019c' }
observed 0.6598370276400909 1048576 1048576
0 '1mb'
observed 0.43183456388024477 1048576 1048576
1 '1mb'
observed 0.012569878657472833 1048576 1048576
2 '1mb'
// [...]
observed 0.6746212970768821 1048576 1048576
44808 '1mb'
observed 0.5481371445716845 1048576 1048576
44809 '1mb'
Sorry I should have been clearer. If you want to collect the entire graph it works fine. But the point of hot reloading is generally that you only want to replace specific modules within the graph.
This is what dynohot does, actually. Imports are rewritten to point to a single "module controller" [import { symbol } from "./ref"; becomes import controller from "hot:module?specifier=./ref";]. The controller maintains handles to simulated module instances and prunes them as they go stale. Module bodies are rewritten into a generator function so that they can be reevaluated over and over without leaking data. Without using --expose-internals, only the module code is leaked (not module resources), once per source version (not per evaluation). If you expose internals you can get away without leaking anything at all.
chunk.mjs:
export const uint8 = new Uint8Array(1024 * 1024);
export const random = Math.random();
for (let ii = 0; ii < uint8.length; ii += 4096) {
  uint8[ii] = 1;
}
setTimeout(() => import.meta.hot.invalidate(), 1);
main.mjs:
import { random, uint8 } from "./chunk.mjs";
import { memoryUsage } from "node:process";

process.stdin.resume(); // stay alive
let ii = 0;
import.meta.hot.accept("./chunk.mjs", () => {
  console.log(++ii, random, uint8.length, memoryUsage());
});
-> % node --loader dynohot main.mjs
(node:5400) ExperimentalWarning: Custom ESM Loaders is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
1 0.7925818173943733 1048576 {
rss: 101040128,
heapTotal: 7290880,
heapUsed: 5859712,
external: 2777815,
arrayBuffers: 2108070
}
[hot] Loaded 0 new modules, reevaluated 1 existing module in 3ms.
2 0.2771700818620366 1048576 {
rss: 102531072,
heapTotal: 10174464,
heapUsed: 5455080,
external: 2773955,
arrayBuffers: 2107670
}
[hot] Loaded 0 new modules, reevaluated 1 existing module in 2ms.
// [...]
[hot] Loaded 0 new modules, reevaluated 1 existing module in 1ms.
689 0.6166603291700468 1048576 {
rss: 247431168,
heapTotal: 8077312,
heapUsed: 5929784,
external: 70927643,
arrayBuffers: 70265058
}
[hot] Loaded 0 new modules, reevaluated 1 existing module in 1ms.
690 0.15742998982874745 1048576 {
rss: 247087104,
heapTotal: 8077312,
heapUsed: 5005952,
external: 3818779,
arrayBuffers: 3156194
}
// gc just ran, memory back to baseline
Hey all,
Given the astounding amount of effort that folks are putting into realising this, would it be worthwhile, perhaps, to re-evaluate implementing this as a core Node.js feature?
Pave the cow paths and all that… :)
All the best, Aral
Better to expose it with a warning than not to expose it and risk breaking lots of npm modules reliant on hacks, potentially creating vulnerabilities.
If you expose internals you can get away without leaking anything at all.
@laverdet what do you need from Node internals to avoid memory leaks?
@GeoffreyBooth the hack I'm using, and the feature most people are asking for here (a replacement for delete require.cache[key]), is pretty unsafe. I'm not sure I'd even want to see it in core nodejs. The yield preamble in dynohot (described here: https://github.com/braidnetworks/dynohot#transformation) means we only have to leak module source code once per file save, and only during development. I haven't run into any issues with the leak, even running a server for multiple days while developing; my server always needs to be restarted for some other reason besides memory. I had some more thoughts in the last paragraph here: https://github.com/nodejs/node/issues/49442#issuecomment-1740995593
The closest thing to a long-term solution I've come up with is an import attribute which can only be used on dynamic imports: something like import('./maybe.mjs', { with: { weak: true } }). This would maintain its own copy of loadCache so that when the reference to the module is lost, the memory is reclaimed. When you get into the implementation details, though, this approach raises many more questions. I think a lot more academic thinking needs to go into this; see the compartments proposal.
It would also be worth considering a version of my evictModule hack exported from node:module with a hostile name like unsafe_unstable_evictModuleByResolvedURL. Such a function would be perpetually marked as 1 - Experimental in the documentation and should probably unconditionally print a warning to the console. I think experimental power tools like this can be a good stepping stone until Mark Miller completes his life's work and gives us a divine answer to these questions.
👉🏻 In the mean time, if the nodejs team is open to it, I'm happy to submit a PR with the unstable function I described here.
I'm sure a lot of lawless module authors would be eager to unleash a new era of footguns on the community.
Well, please don’t sell it too hard 😄
We don’t want perpetually experimental things. But I don’t know why “evict” would need to be; is it expected to have unavoidable breaking changes frequently? Or were you suggesting the experimental status just because it’s strongly discouraged?
We have a few flags that are already for particular use cases and are strongly discouraged in general, especially for production; --expose-internals is probably the most prominent of these. We could add another, like --allow-unsafe-module-replacement or whatever, but I think it would only be worth doing if the flag provided a meaningfully better UX/DX than what is currently possible. Looking at https://github.com/braidnetworks/dynohot/blob/fb822d2022f9d71d9d7ab5377d5b5d55ddcb26a8/runtime/utility.ts#L62-L83, though, I don’t think this is avoiding the memory leak; loadCache is used for spec compliance when two import() expressions have the same resolution and the underlying resource has been deleted between loads, and per spec it needs to continue to load successfully every time. There’s another cache within V8 for all the ES modules that have ever been loaded and evaluated, and as far as I know there’s still no API for removing or updating those. That’s the API we’ve been waiting for for years, that would truly solve this issue. (And if it’s finally been added, please let us know and provide a link; or feel free to work with the V8 team and submit a PR.)
There’s another cache within V8 for all the ES modules that have ever been loaded and evaluated, and as far as I know there’s still no API for removing or updating those.
This is a common misconception but it is not true and I don't believe it has ever been true. I've refuted it using isolated-vm here https://github.com/nodejs/node/issues/49442#issuecomment-1741839325 and here https://github.com/nodejs/loaders/issues/157#issuecomment-1687044349. Module records in v8 are no more special than a plain object. Since the fix in nodejs v20.8.0, you can also prove this using SourceTextModule from node:vm.
Or were you suggesting the experimental status just because it’s strongly discouraged?
Yes, I would discourage its use. require.register and require.cache contributed to the total anarchy we've seen under CommonJS, which the ecosystem is still feeling to this day. The evict function explicitly breaks the invariants promised by the specification. I'm clearly ok with reality-altering features (see fibers), but I do want to make sure I communicate the nature of what I've proposed.
This is a common misconception but it is not true and I don’t believe it has ever been true.
Then are you using such an API, and if not, why not? Wouldn’t replacing modules loaded into V8 solve memory leaks?
Then are you using such an API, and if not, why not? Wouldn’t replacing modules loaded into V8 solve memory leaks?
I'm not sure I understand. There is no v8 API for this. Simply put, once there are no more handles to a module, it is garbage collected, in the same way a JSON object or any other value would be garbage collected. Since the module is in node's loadCache forever, it can never be collected. There's no replacing a module, in the same way you can't modify the source code of a function after it's been created.
We have an implementation of the "hack" (with the leak) in https://github.com/platformatic/platformatic/blob/main/packages/runtime/lib/loader.mjs. Works great.
Note that we don't do full HMR. Basically we reload the app, and we have special code to not close the HTTP server and live restart a Fastify application. This is mostly side effect free.
Note that quite a lot of people are working towards a world where this is possible. My NodeContext PR stalled because of memory leaks in instantiating Node.js core objects, but the team is busy fixing those.
If the leak is only caused by loadCache / the “module map”, then I think it should be fine to have an API like import { clearCache } from 'node:module' that allows you to delete a particular entry (clearCache(moduleAbsoluteURL)) or even all entries; that’s just like clearing the cache in a browser. I think the only consequence would be that a subsequent import might produce a different result rather than the previously cached result, but that’s also similar to a browser cache having been cleared. I know the spec says that subsequent imports must return the same result, but I assume that that means “in general usage,” not if the user has specifically instructed the runtime to not return a cached result.
that’s just like clearing the cache in a browser
It's not a good comparison, since the browser cache doesn't affect correctness. I mentioned that above: I think "cache" is a poor choice of name for this, since it is a matter of correctness and not performance.
I think the only consequence would be that a subsequent import might produce a different result rather than getting the previously cached result, but that’s also similar to a browser cache having been cleared.
Clearing the browser cache of a running web page will not affect the result of import(). It has no observable effect, except on timing.
I know the spec says that subsequent imports must return the same result, but I assume that that means “in general usage,” not if the user has specifically instructed the runtime to not return a cached result.
The language in the specification is clear, and our proposed function is a violation of the invariants expected of the host. As a matter of personal style I am totally ok violating any rule for any reason, but I do want to make sure we're on the same page that this is a dangerous and powerful violation of the specification.
I do want to make sure we're on the same page that this is a dangerous and powerful violation of the specification.
I mean, the whole point of the module customization hooks is to allow the user to violate the spec however they please (to the extent Node can achieve it). It's not like importing CoffeeScript is spec compliant. If we're worried about dependencies using this API for mischievous ends, we could gate it behind a flag or create a permission for it; but I'm not sure what the risk is.
It's not like importing CoffeeScript is spec compliant.
It is, though. CoffeeScript would be a Cyclic Module Record (a module which participates in the specified cyclic resolution algorithm). Cyclic Module Record is described in the specification as "abstract".
Source Text Module Record, which is the ES Modules that we all know, is a concrete implementation of that abstract interface. So a CoffeeScript module would be a separate but valid implementation of a Cyclic Module Record.
If you go up one level there is a plain Module Record, from which Cyclic Module Record is derived. This provides the means for modules which don't participate in the cyclic resolution algorithm (wasm, json, or the whole node:
scheme). None of CoffeeScript, WASM, or TypeScript is specified in ECMA-262, but they are still spec compliant.
Anyway, you are right that the loaders API provides the means to break specification, since the result of resolve
is neither pure nor is it memoized. This breaks the invariants of HostLoadImportedModule
: "If this operation is called multiple times with the same (referrer, specifier) pair [then it must resolve] with the same result each time". I suppose the difference here is that we are proposing a globally-importable utility which could be used outside of a loader.
but I'm not sure what the risk is
The risks are impossible to enumerate since we're reneging on a presumed invariant. Like what happens if a user attempts to evict node:fs
? Idk, will it segfault?
Anyway the risks are probably mostly hypothetical in nature. I'll try and open a PR soon to continue the discussion.
I'm abandoning the PR at #50618. After more reflection I think this is going to harm the ecosystem more than it will do any good. From what I can tell there are 3-4 different use-cases mentioned in this issue which can be solved in other ways:
Q: "How do I retry a module whose file content was created after a failed import"
A: Just reimport it with a cache bust: await import("failed-module?retry=1")
Q: "How do I reload a module and all its dependencies" A: Use a loader which carries forward the cache bust: https://github.com/nodejs/node/pull/50618#issuecomment-1894603753
Q: "How do I add module support to unit test frameworks"
A: Use vm.SourceTextModule
Q: "How do I hot reload modules during development" A: Use dynohot
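The first answer above (cache-busting with a query string) can be sketched concretely. This is an illustrative helper, not an official API: the `importFresh` name and the temp-file demonstration are assumptions for the example; the underlying behavior, that the ESM cache is keyed by the fully resolved URL including its query, is real Node.js behavior.

```javascript
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import { pathToFileURL } from 'node:url';

// Each distinct query string yields a fresh module instance, because the ESM
// cache is keyed by the fully resolved URL. The counter is illustrative.
let attempt = 0;
async function importFresh(url) {
  attempt += 1;
  return import(`${url}?retry=${attempt}`);
}

// Demonstration: the second import observes the rewritten file.
const file = path.join(os.tmpdir(), `retry-demo-${process.pid}.mjs`);
fs.writeFileSync(file, 'export const value = 1;');
const href = pathToFileURL(file).href;

const first = await importFresh(href);
fs.writeFileSync(file, 'export const value = 2;');
const second = await importFresh(href);
// first.value === 1, second.value === 2
```

Note the downside this thread keeps circling back to: each bust leaves the previous instance alive in the cache, so memory grows with every reload.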
@laverdet Does this look fine to you (this is a reworked version of https://github.com/nodejs/node/pull/50618#issuecomment-1894603753)?
@pygy The loader code looks correct and does what it says it does.
import * as mDev from './my-module.js?dev'
process.env.NODE_ENV='production'
import * as mProd from './my-module.js?prod'
This counter-example from the documentation doesn't make sense though. Imports are not executed in an imperative manner, so the body of ./my-module.js?prod
will actually run before process.env.NODE_ENV='production'
is evaluated. In fact you can't even guarantee that ./my-module.js?dev
will run before ./my-module.js?prod
without also looking at the rest of the module graph.
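The hoisting point above can be shown directly. A minimal sketch, assuming a throwaway module that records NODE_ENV at evaluation time (the temp file and names are illustrative): static imports are evaluated before any statement in the importing module runs, so only dynamic import() gives the imperative ordering the original example implied.

```javascript
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import { pathToFileURL } from 'node:url';

// A module that captures process.env.NODE_ENV when it is evaluated.
const file = path.join(os.tmpdir(), `env-demo-${process.pid}.mjs`);
fs.writeFileSync(file, 'export const env = process.env.NODE_ENV;');
const href = pathToFileURL(file).href;

delete process.env.NODE_ENV;
const mDev = await import(`${href}?dev`);   // evaluates with NODE_ENV unset
process.env.NODE_ENV = 'production';        // runs between the two evaluations
const mProd = await import(`${href}?prod`); // evaluates with NODE_ENV set
// mDev.env === undefined, mProd.env === 'production'
```

With static `import` declarations, both module bodies would have run before the assignment, and both would observe the same NODE_ENV.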
Good catch, thanks this should have been dynamic imports.
Fixed, and I added an example with dependencies for extra clarity:
Suppose these files:
// foo.js
export {x} from "./bar.js"
// bar.js
export const x = {}
We can then do
import "esm-reload"
const foo1 = await import("./foo.js?instance=1")
const bar1 = await import("./bar.js?instance=1")
const foo2 = await import("./foo.js?instance=2")
const bar2 = await import("./bar.js?instance=2")
assert.equal(foo1.x, bar1.x)
assert.equal(foo2.x, bar2.x)
assert.notEqual(bar1.x, bar2.x)
Edit again: https://www.npmjs.com/package/esm-reload
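The core of such a loader can be sketched as a single `resolve` customization hook that carries a parent module's ?instance=N query forward to its relative imports, so the whole subgraph is re-instantiated together. The hook shape follows the Node module customization hooks API; the exact propagation rule below is illustrative, not necessarily what the esm-reload package does.

```javascript
// Resolve hook: propagate the parent's cache-busting query to child imports.
export async function resolve(specifier, context, nextResolve) {
  const resolved = await nextResolve(specifier, context);
  if (context.parentURL) {
    const parentSearch = new URL(context.parentURL).search;
    const url = new URL(resolved.url);
    if (parentSearch && !url.search) {
      url.search = parentSearch; // carry e.g. "?instance=2" forward
      return { ...resolved, url: url.href };
    }
  }
  return resolved;
}
```

Registered via `module.register()`, this makes `import("./foo.js?instance=2")` pull in a fresh `./bar.js?instance=2` as well, which is why `bar1.x !== bar2.x` holds in the example above.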
There has been no activity on this feature request for 5 months. To help maintain relevant open issues, please add the https://github.com/nodejs/node/labels/never-stale label or close this issue if it should be closed. If not, the issue will be automatically closed 6 months after the last non-automated comment. For more information on how the project manages feature requests, please consult the feature request management document.
Should probably close this as "wont do"?
My example
import fs from 'node:fs';
import { isBuiltin } from 'node:module';
import path from 'node:path';
import { fileURLToPath } from 'node:url';
import {
createContext,
type Module,
type ModuleLinker,
SourceTextModule,
SyntheticModule
} from 'node:vm';
const ROOT_MODULE = '__root_module__';
const link: ModuleLinker = async (specifier: string, referrer: Module) => {
// Node.js native module
const isNative = isBuiltin(specifier);
// node_modules
const isNodeModules =
!isNative && !specifier.startsWith('./') && !specifier.startsWith('/');
if (isNative || isNodeModules) {
const nodeModule = await import(specifier);
const keys = Object.keys(nodeModule);
const module = new SyntheticModule(
keys,
function () {
keys.forEach((key) => {
this.setExport(key, nodeModule[key]);
});
},
{
identifier: specifier,
context: referrer.context
}
);
await module.link(link);
await module.evaluate();
return module;
} else {
const dir =
referrer.identifier === ROOT_MODULE
? import.meta.dirname
: path.dirname(referrer.identifier);
const filename = path.resolve(dir, specifier);
const text = fs.readFileSync(filename, 'utf-8');
const module = new SourceTextModule(text, {
initializeImportMeta,
identifier: specifier,
context: referrer.context,
// @ts-expect-error
importModuleDynamically: link
});
await module.link(link);
await module.evaluate();
return module;
}
};
export async function importEsm(identifier: string): Promise<any> {
const context = createContext({
console,
process,
[ROOT_MODULE]: {}
});
const module = new SourceTextModule(
`import * as root from '${identifier}';
${ROOT_MODULE} = root;`,
{
identifier: ROOT_MODULE,
context
}
);
await module.link(link);
await module.evaluate();
return context[ROOT_MODULE];
}
function initializeImportMeta(meta: ImportMeta, module: SourceTextModule) {
  // Resolve the identifier to a file URL first, then derive the path forms.
  meta.url = import.meta.resolve(module.identifier);
  meta.filename = fileURLToPath(meta.url);
  meta.dirname = path.dirname(meta.filename);
  meta.resolve = import.meta.resolve;
}
Use it
const module = await importEsm('filename');
If you are interested, we created Hot Hook
to hot-reload Node.js imports during development.
https://adonisjs.com/blog/hmr-in-adonisjs https://docs.adonisjs.com/guides/concepts/hot-module-replacement https://github.com/julien-R44/hot-hook
How do I invalidate the cache of import? I have a function that installs missing modules when an import fails, but the import statement seems to preserve the failure while the script is still running. The only information I found regarding import caching was this documentation, which does not tell me where the "separate cache" used by import can be found.
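For contrast, the cache this question is looking for does exist, and is mutable, on the CommonJS side: it is require.cache, keyed by resolved filename. ESM exposes no equivalent, which is exactly what this issue asks for; the query-string workaround discussed above is the usual substitute. A small demonstration (temp file is illustrative):

```javascript
import { createRequire } from 'node:module';
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

const require = createRequire(import.meta.url);
const file = path.join(os.tmpdir(), `cjs-demo-${process.pid}.cjs`);
fs.writeFileSync(file, 'module.exports = { n: 1 };');

const first = require(file);
fs.writeFileSync(file, 'module.exports = { n: 2 };');
delete require.cache[require.resolve(file)]; // evict the CJS cache entry
const second = require(file);
// first.n === 1, second.n === 2
```

A failed dynamic import() in ESM, by contrast, can only be retried by changing the specifier (e.g. appending a query string), since the rejected entry cannot be evicted.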