jonathantneal opened this issue 5 years ago
The import map is just a specific resolver implementation, so no different to the existing hooks we already have.
@guybedford currently it is unsafe to populate the same URL twice in V8 in the same context per the issue linked above, it segfaults usually
Right, it sounds like a fix for that is a good first step then indeed.
@guybedford per the issue above, it has an existing changeset that fixes it, but it is no longer assigned and several attempts to bump it have been made in various locations. Even if we do change it, GC is currently completely disabled. Per Snowpack, the workflows being looked at are around signaling and manual alteration via import.meta.hot, not around exposing the Map of URL=>instance. Also, URL=>instance isn't stable, since asserts are, and other things might be, additional parts of the cache key. Exposing the cache key is likely unstable enough that we don't want to do so.
around signaling and manual alteration via import.meta.hot
Will this support named exports mutations? How will export * be handled? Doesn't this just get into the same problems of dynamic modules?
Working with v8 to make the cache key stable is important, yes including the GC issues. Node.js and Deno definitely are the drivers of this work. If v8 fork / custom patches are needed then even that makes sense, as having control of the module system is important to a JS platform!
Will this support named exports mutations? How will export * be handled? Doesn't this just get into the same problems of dynamic modules?
No, they completely reload with new absolute cache keys in their implementation. export * causes strong linkage that requires that whole subsection of the graph to reload if it reloads. Per https://github.com/snowpackjs/esm-hmr the idea is to manually replace locals at the boundary locations of the replacement. I made some slides on this just to visualize their approach after that call with TC39; @JoviDeCroock could likely speak to the details better than I.
Two-pronged approach seems fine, yes. But then the full reload scenario relies exactly on what is being discussed here, from v8 bugs to the Node.js API per the last comments.
@guybedford what prevents full reload from following the same workflow? If the entire graph is strongly connected it would still work I believe.
@bmeck not sure I understand, do you mean always relying on the import.meta.hot changing local specifiers?
@guybedford yes, and if the entire graph is reloaded (full reload) there isn't a need for changing local bindings.
@bmeck right, but you don't want to refresh the entire graph is the point. I think we do need the ability to refresh subgraphs as having the options being full reload only (as in, restarting Node.js) or local bindings only seems arbitrarily restricting.
I scanned and haven't seen this mentioned anywhere yet - given that the cache can be bypassed by passing a querystring parameter:
let c = 0;
function importFresh(mod) {
  return import(`${mod}?v=${++c}`);
}
... couldn't the issue here be reframed to "it is not possible to update a module's exports after it has been imported"?
A workaround like this is what I've been noodling on:
// convert exports to non-const bindings
export var Foo = class Foo {};
export var bar = 42;

import.meta.hot.accept(async ({ module }) => {
  /* These don't work: */
  // Object.assign(await import(import.meta.url), module);
  // Object.defineProperties(await import(import.meta.url), Object.getOwnPropertyDescriptors(module));
  ({ Foo, bar } = module);
});
Well, the exports alone aren't sufficient; imagine a scenario where the following happens:
const x = 8;
export const getFoo = () => 1 + x;
Changing the getFoo isn't sufficient for this scenario; we would need to replace the entire module in this case.
Replacing the export to point to the new module instance should be sufficient - getFoo becomes a reference to the new module's exported getFoo, which has its own copy of x. For preserving/modifying state, that seems like a concern that would be external to module cache invalidation.
There is a case where this is fully broken though, which is when exports are added in an updated version of a module that were not previously defined.
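A minimal sketch of that broken case, using plain variables to stand in for live bindings and a plain object to stand in for the replacement module (all names here are illustrative, not a real API):

```javascript
// Old live bindings, as in the `export var` workaround above:
let Foo = class {};
let bar = 42;

// The freshly loaded replacement module, which adds a brand-new export:
const newModule = { Foo: class {}, bar: 43, baz: 'added in the update' };

// The destructuring-reassignment trick updates the bindings that already exist:
({ Foo, bar } = newModule);
console.log(bar); // 43

// ...but `baz` did not exist in the old module, so no importer was ever linked
// against a `baz` binding; there is nothing to reassign and the new export is
// invisible to existing consumers.
```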
couldn’t the issue here be reframed to “it is not possible to update a module’s exports after it has been imported”?
There’s a loader built by @giltayar around this principle, that appends unique timestamps to specifiers so that pseudo-HMR can work. The issue is that the no-longer-needed earlier versions of such modules are never removed from memory, and so over the course of hours while you’re working (and hot-reloading the same module over and over and over) Node will eventually run out of memory and crash.
The issue is that the no-longer-needed earlier versions of such modules are never removed from memory, and so over the course of hours while you’re working (and hot-reloading the same module over and over and over) Node will eventually run out of memory and crash.
We can't really make an assumption that HMR won't leak. Even in CJS it leaks.
We can't really make an assumption that HMR won't leak.
Fair, but at least that's accidental. The query string approach guarantees eventually running out of memory.
quick one about the whole graph:
The hard part is not exposing the private module wrap interface and wanting to provide dependency graph metadata for clearing ancestors.
to have at least parity with CJS: when I delete require.cache[require.resolve('../path')] in CJS, it doesn't invalidate the modules it requires, neither relative paths nor installed packages. Accordingly, I don't think invalidating the whole subtree is needed, or even desired; at the least, I could deal with the cache being exposed the way CJS does it.
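The CJS behaviour described here can be verified with a couple of generated throwaway files (a sketch for illustration, not production code):

```javascript
// Deleting one entry from require.cache re-evaluates only that module,
// not the modules it requires. We generate two temp files to demonstrate.
import { mkdtempSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import { join } from 'path';
import { createRequire } from 'module';

const require = createRequire(import.meta.url);
const dir = mkdtempSync(join(tmpdir(), 'cjs-cache-'));
writeFileSync(join(dir, 'dep.js'), 'module.exports = { n: Math.random() };');
writeFileSync(join(dir, 'parent.js'), 'module.exports = require("./dep.js");');

const parentPath = join(dir, 'parent.js');
const first = require(parentPath);

// Invalidate only the parent, exactly as in the comment above:
delete require.cache[require.resolve(parentPath)];
const second = require(parentPath);

// parent.js re-ran, but dep.js was served from the cache, so the exported
// object is identical:
console.log(first === second); // true: the subtree was NOT invalidated
```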
quoting from @developit's suggestion:
couldn't the issue here be reframed to "it is not possible to update a module's exports after it has been imported"?
While I think it's a valid point to ask for simplification of the matter, I still disagree with doing so.
I think the point here is that node users are expecting a cache dictionary at require.cache whose entries can simply be deleted using the delete keyword, so unsurprisingly they expect the same functionality when the module system is upgraded (ESM import).

Disregarding all the discussions around standards etc., the most reasonable way of fixing the issue to me would hence be to introduce a cache dictionary on import and allow the user to delete entries. Surely a lot of thought was already put into require.cache when it was implemented in node.
While adding a query string may be a valid workaround to re-import modules, it is logically something different than clearing a specific part of a cache. As others have noted, it can additionally lead to memory leaks.
IMO, any functionality that goes beyond that (e.g. fancy hot module replacement) is a specific technical vision and should be handled separately.
introduce a cache dictionary on import and allow the user to delete entries
strong agree with you, but require.cache was likely born in times when Map was not a thing, so I personally wouldn't mind if the import cache were exposed as a Map, just to have it aligned with the more-modern JS it's representing.
import.meta.cache.delete('../path.js');
fwiw require.cache was cited during the design of ESM as one of the motivations for an immutable cache. @TimDaub require.cache was more or less added at the whim of one of the early designers of node. I'm also curious if your specific use case is HMR.
If digging up past arguments here, then it's worth noting that a mutable registry map was always an original design goal for ESM loaders - https://whatwg.github.io/loader/#registry-constructor.
@guybedford correct, but that was not pursued to completion due to various issues and some of those original participants are on the calls mentioned above.
So, there's no way to implement hot reloading during development while using ESM modules? 😥
I tried adding a query parameter with a random string to work around the cache, but I can't do it for multiple files, only for the index file, since I'm trying to build a library that provides hot module reloading. Any help would be greatly appreciated 🙂
@vasanthdeveloper you can look at https://github.com/snowpackjs/esm-hmr and/or https://dev.to/giltayar/mock-all-you-want-supporting-es-modules-in-the-testdouble-js-mocking-library-3gh1 for different approaches to do this. Note that loaders remain unstable though.
We have gotten esm-hmr to work in the browser, although this is a pretty "unconventional" approach in the sense that the original Module will actually stay in the browser.
As you can see in esm-hmr, on first serve we'll attempt to create a moduleGraph Map which lists each module with its dependencies, its dependents, and whether or not it has a module.hot.accept in its code.
When an update happens to a module that accepts updates, we'll fetch said module appended with a ?mtime=x query parameter. This means that at this point we'll have the new module in memory, but we can't just inject it into the currently in-use module, so frameworks currently write code to hot-replace these, Prefresh being one of them. This code has logic for a specific framework, in this case Preact, to hot-replace a Component.

When a child that doesn't accept its own updates changes, we bubble up and treat parents that do accept updates as boundaries for updating those children. When we encounter such a boundary, we'll have to deeply rewrite the imports for the path leading up to said child with the same ?mtime=x technique.
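That bubbling step can be sketched roughly like this (illustrative data and function names, not the actual esm-hmr code):

```javascript
// A toy moduleGraph: each entry records who depends on the module and
// whether the module registered a hot-accept handler.
const moduleGraph = new Map([
  ['/Counter.jsx',  { dependents: ['/Internal.jsx'], accepts: false }],
  ['/Internal.jsx', { dependents: ['/App.jsx'],      accepts: true  }],
  ['/App.jsx',      { dependents: [],                accepts: false }],
]);

// Walk dependents upward until modules that accept updates are found;
// those become the boundaries at which imports are rewritten.
function findBoundaries(url, seen = new Set()) {
  if (seen.has(url)) return [];    // guard against import cycles
  seen.add(url);
  const node = moduleGraph.get(url);
  if (!node) return [];            // unknown module: fall back to full reload
  if (node.accepts) return [url];  // this module handles its own updates
  return node.dependents.flatMap((dep) => findBoundaries(dep, seen));
}

console.log(findBoundaries('/Counter.jsx')); // [ '/Internal.jsx' ]
```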
The tricky part presents itself when subsequent updates happen as explained here
When we update Counter.jsx, the child of Internal.jsx, we implicitly import Counter.jsx?mtime=Date.now(); this makes our babel-functions re-run and register the new type for Counter.jsx. For in-place updates this works great, but when we now update Internal.jsx we'll be importing the old variant of Counter.jsx, since during esm-hmr we have no way to update this ModuleRecord in place; essentially the Counter.jsx?mtime=Date.now() is orphaned and disposed instantly.
internal.jsx updates with the old reference, without checking whether a new one exists (byte-cache); this makes it render with the initial code rather than the new.
This sums up the current, less-than-ideal approach we are taking in the browser to circumvent this caching issue. I know this is the nodejs repository, but it could be a useful bit of information.
I bypassed ESM caching by making my own loader which appends a random string as a query at the end of each file being imported. Here's the code I made for testing 👇
import { URL } from 'url'

export const resolve = async (specifier, context, defaultResolve) => {
  const result = await defaultResolve(specifier, context, defaultResolve)
  const child = new URL(result.url)

  if (
    child.protocol === 'nodejs:' ||
    child.protocol === 'node:' ||
    child.pathname.includes('/node_modules/')
  ) {
    return result
  }

  return {
    url: child.href + '?id=' + Math.random().toString(36).substring(3),
  }
}
To see if this leaks memory, I made a chain of JS files that import each other, continuously changed their contents 1000+ times, and monitored the RAM usage of node, and it seems all normal to me. 🤷♂️ The ESM cache seems to be cleared automatically as soon as no other file imports that particular file.
Before starting node, I add my loader using the following command 👇
node --no-warnings --experimental-loader ./hot.js src/index.js
Thanks to @bmeck for guiding me 😅
@vasanthdeveloper module management is bytes not megabytes :) They are certainly not cleared. Try allocating a 1MB string in each module. But yes, practically we may be able to go far on just not dealing with GC problems, until that hits a wall of course.
I'm gonna try adding a 1MB string in those JavaScript files. But since it's only for development, I think it's tolerable.
Also using the query string method for the in-progress ESM version of JavaScript Database (JSDB), and it’s not tenable due to the memory leak (my append-only JavaScript data files can be hundreds of megabytes in size).
I’d love to see Node pave the cowpaths on this and support the query-string method of ESM cache-busting by garbage collecting the previous version of the module when it detects the practice. (I haven’t peeked into the code so this is probably way easier said than done.)
@aral V8 won't GC the module once it is linked, please note that Node does not modify V8 generally and so the issue of allowing GC of modules is likely better on their issue tracker.
Can we raise the issue in V8 to provide C++ API for hot reloading modules?
@giltayar have you looked into using Workers or other solutions to have a module cache that you can destroy (such as by killing the Worker)?
+100000000 for using Workers. I created a small worker script that receives the module path and some arguments (as an object) for the module in the workerData object. The worker uses a dynamic import to load the module and execute it with the arguments. It then just returns the result to the parent. Since the worker executes in its own module context, there is no need to invalidate the cache. Works perfectly. ❤️
Node.js 14.x lts
import { Worker } from 'worker_threads'

const worker = new Worker('./module_executor.js', {
  workerData: { modulePath: './path/to/module.js', args: { arg1: 1, arg2: 2 } }
})

worker.once('message', result => {
  console.log(result)
})

worker.once('error', error => {
  console.error(error)
})
import { workerData, parentPort } from 'worker_threads'

async function executeModule({ modulePath, args }) {
  const { default: mod } = await import(modulePath)
  // const { mod } = await import(modulePath) // depending on how you exported

  let result
  if (mod.constructor.name === 'AsyncFunction') {
    result = await mod(args)
  } else {
    result = mod(args)
  }

  parentPort.postMessage(result)
}

executeModule(workerData)
@vsnthdev No idea why your solution uses so little memory... but it actually works!
PS: I'm currently in the process of migrating lambda-tdd and node-tdd from require to import.
@simlu Thank you 😊
But please refrain from using this in production.
I used to use this technique to primarily have HMR support during development 🙂
@vsnthdev Oh I would never. Just for lots and lots of test suites =)
Still only invalidates the cache for the imported module, not for the indirect dependencies. So still no switching to ESM 😟.
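A small demonstration of that limitation, with throwaway files generated for the example:

```javascript
// Busting the entry module's cache via a query string re-evaluates the entry,
// but its own imports resolve to the un-suffixed URLs, which stay cached.
import { mkdtempSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import { join } from 'path';
import { pathToFileURL } from 'url';

const dir = mkdtempSync(join(tmpdir(), 'esm-deep-'));
writeFileSync(join(dir, 'dep.mjs'), 'export const value = 1;');
writeFileSync(join(dir, 'entry.mjs'), 'export { value } from "./dep.mjs";');

const entry = pathToFileURL(join(dir, 'entry.mjs')).href;
const first = await import(`${entry}?v=1`);

// Change the dependency on disk, then re-import the entry with a new query:
writeFileSync(join(dir, 'dep.mjs'), 'export const value = 2;');
const second = await import(`${entry}?v=2`);

console.log(second.value); // still 1: dep.mjs was served from the cache
```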
@vsnthdev I've used your method to implement cache blasting in mocha --watch mode for ESM imports, and it works! https://github.com/mochajs/mocha/issues/4374#issuecomment-1073140194
@vsnthdev I've used your method to implement cache blasting in mocha --watch mode for ESM imports, and it works!
I am glad it helped you 😊
This might just be the dumbest thing to do (and will probably do bad things to node's cache system), but you can add random parameters to your imports to get an uncached version
const myModule = await import(`./myFile?cachebust=${Date.now()}`)
Needless to say, this isn't for a prod system
Here is the hot reload file we use to achieve the balance between cache invalidation and cache reuse, to be able to run hundreds of tests without hitting memory limits while still invalidating the necessary files.
We invalidate by environment variable and by comment. This requires adding strategic comments to the code to invalidate the necessary files.
For inspiration or for use as is. This took some fine tuning. Do not use in production
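The file itself isn't inlined here, but the idea can be sketched as a resolve hook that only cache-busts files that opt in; HOT_RELOAD_PATTERN and the /* @hot */ marker are made-up names for illustration, not the actual implementation:

```javascript
import { readFileSync } from 'fs';
import { fileURLToPath } from 'url';

// Decide whether a file should be invalidated: either its path matches an
// env-var pattern, or the file carries the strategic marker comment.
function shouldInvalidate(url) {
  const pattern = process.env.HOT_RELOAD_PATTERN; // e.g. "src/handlers"
  if (pattern && url.includes(pattern)) return true;
  try {
    return readFileSync(fileURLToPath(url), 'utf8').includes('/* @hot */');
  } catch {
    return false;
  }
}

// Loader resolve hook: opted-in files get a fresh cache key, everything
// else keeps its cached instance (so memory use stays bounded).
export async function resolve(specifier, context, nextResolve) {
  const result = await nextResolve(specifier, context);
  if (result.url.startsWith('file:') && shouldInvalidate(result.url)) {
    return { ...result, url: `${result.url}?v=${Date.now()}` };
  }
  return result;
}
```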
Ah yes, the reason I was wondering how @remix-run's dev mode had a memory leak; turns out it's because of cache busting... Btw, may I know the reason why cache removal is not coming to nodejs' ESM? I can't even tell whether nodejs focuses on browser specs like deno, or on server side stuff... Why am I talking about deno here? Because they have import maps, and here we can't even get cache unloading. Like, it's not that dangerous? Pretty sure it's going to be used in dev mode and not in production. Like come on, don't tell me hackers will hack via this method! There are literally a lot of ways to easily hack a PC if hackers can get their hands on it!
Can't we just keep it simple as it was before? Workers need a lot more lines + complexity, unlike deleting the require cache 😄
@renhiyama essentially, v8 is in control of the module cache & node has no way to interact with it. i think there was some discussion around getting v8 to add an api, but it's quite an uphill trek & it's not likely to be undertaken vigorously when cache-busting can be achieved in userland via the workaround provided above.
@zackschuster but cache busting creates a memory leak, since the old copies are not wiped out unless using workers? (or maybe even with workers; I haven't tested workers yet since it looks more complex than clearing the require cache)
Does the experimental ESM Loader Hooks API, released this week in Node.js 18.6.0, provide any new avenues for solving this problem? https://github.com/nodejs/node/releases/tag/v18.6.0 https://dev.to/jakobjingleheimer/custom-esm-loaders-who-what-when-where-why-how-4i1o
but cache busting creates a memory leak since the old ones are not wiped out
hot reloading in a browser has the same issue, incidentally.
unless using workers? (or maybe even using workers, I didnt test workers till now since it looks complex than clearing require cache)
i think workers have ties to the FinalizationRegistry that help ensure they get cleaned up, but don't quote me on that.
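For reference, FinalizationRegistry itself is a standard API; here's a minimal sketch of observing collection (whether Workers actually use it internally is, as said above, unverified):

```javascript
// A registry whose callback runs some time after a registered object is
// garbage collected. The object here merely stands in for an old module
// version; this is illustration, not Node internals.
const registry = new FinalizationRegistry((label) => {
  console.log(`${label} was garbage collected`);
});

let stale = { name: 'module.js?v=1' }; // placeholder for an old module version
registry.register(stale, stale.name);
stale = null; // drop the last strong reference

// Collection timing is entirely up to the engine; with --expose-gc you can
// nudge it via globalThis.gc?.(), but nothing here is guaranteed to run.
```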
I like how chrome doesn't even try a single thing to lower the RAM usage, not even providing alternative methods so memory leaks don't happen...
This cowpath should be paved. The original intent of shipping a Module Registry for EcmaScript Modules was a good one, that recognized a valid need very very widely faced. Trying to un-ship this capability denies us what we need. Give the users what they want.
Does the experimental ESM Loader Hooks API, released this week in Node.js 18.6.0, provide any new avenues for solving this problem? https://github.com/nodejs/node/releases/tag/v18.6.0 https://dev.to/jakobjingleheimer/custom-esm-loaders-who-what-when-where-why-how-4i1o
Vsnthdev's workaround uses the programmable ESM loader hooks to redirect every request to a new uniquely named request, which seemingly works ok.
Does the experimental ESM Loader Hooks API, released this week in Node.js 18.6.0, provide any new avenues for solving this problem? https://github.com/nodejs/node/releases/tag/v18.6.0 https://dev.to/jakobjingleheimer/custom-esm-loaders-who-what-when-where-why-how-4i1o
Nope, sadly not.
Also, if the officially-linked article (see end of https://dev.to/jakobjingleheimer/custom-esm-loaders-who-what-when-where-why-how-4i1o) for how to do ESM cache invalidation with Node.js (https://dev.to/giltayar/mock-all-you-want-supporting-es-modules-in-the-testdouble-js-mocking-library-3gh1) includes (count them) four “sneaky!” exclamations while describing the method, that’s a smell.
Not mentioned in that article: that the method leaks memory.
Sneaky!
How do I invalidate the cache of import? I have a function that installs missing modules when an import fails, but the import statement seems to preserve the failure while the script is still running. The only information I found in regards to import caching was this documentation, which does not tell me where the “separate cache” used by import can be found.
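One hedged sketch of working around that, assuming the failed resolution is cached for the bare specifier: after installing, resolve the package to a concrete file URL and cache-bust it with a throwaway query string. `importWithInstall` and the npm-install step are illustrative, and error codes and caching of failures can vary across Node versions:

```javascript
import { execSync } from 'child_process';
import { createRequire } from 'module';
import { pathToFileURL } from 'url';

const require = createRequire(import.meta.url);

async function importWithInstall(specifier) {
  try {
    return await import(specifier);
  } catch (err) {
    if (err.code !== 'ERR_MODULE_NOT_FOUND') throw err;
    execSync(`npm install ${specifier}`, { stdio: 'inherit' }); // illustrative
    // Re-importing the bare specifier may still hit the cached failure, so
    // resolve to a concrete file URL and give it a fresh cache key (this does
    // leak one module instance per retry, per the rest of this thread):
    const url = pathToFileURL(require.resolve(specifier));
    url.search = `installed=${Date.now()}`;
    return import(url.href);
  }
}
```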