nodejs / tooling

Advancing Node.js as a framework for writing great tools
167 stars 17 forks source link

ESM module reloading and module graph #51

Open boneskull opened 4 years ago

boneskull commented 4 years ago

In a nutshell, we need to be able to "unload" an ESM module, which is not supported. Tooling needs this for reloading files (e.g. in a tool's "watch" mode) or mocking modules (using proxyquire, rewiremock, etc.)

Ref: nodejs/node#49442

coreyfarrell commented 4 years ago

To refine what I was suggesting in the meeting maybe we could create a standard but flagged way to do this (TC39 proposal?). For example import.unload('specifier') would be unavailable by default but running node.js or the browser with a flag could expose the unload API. The assumption being that webdriver and/or puppeteer could use the flag when initializing the browser for test purposes.

theKashey commented 4 years ago

I should comment here months ago, but it's never too late. HMR is supported by all major bundlers, a bit defferently in every case, so there is no real "standard".

So which parts are needed for HotModuleReplacements:


As a result, the main issue to resolve is actually how to "accept" and "reconcile" the update.

I am thinking - it is possible to have an explicit hot import - import stuff from 'hot:./module'

However, this is not letting to implement some custom update logic, which might be needed, and supported by bundlers... and which I never seen used.

boneskull commented 4 years ago

to be clear, I think what we're talking about is not the browser, or how a bundler must work around the limitations, but rather what Node.js can add (beyond the ES spec) to support what tools running in Node.js need and expect from require. "what we need" is what I'm trying to hammer down.

if Node.js declines to support these use cases--I don't think it will, but if it did--it would place a significant barrier to using non-transpiled ESM in Node.js code. unfortunately, it seems like it's not a barrier that devs see until they've already started driving forward using native ESM in Node apps

boneskull commented 4 years ago

Per our discussion, this is what's needed:

  1. A module needs to know which module imported it.
    • If module a imports module b, we should be able to query an API with module b and see that module a is its "parent".
    • Module b's "parent" should not be "the module that imported module b first", which is essentially how Node.js' CJS system works.
  2. We need to be able to prune (unload, remove from cache) an entire subtree of modules, with an optional target (a module "in the direction" of another module).
    • For example, module a imports module b, which imports module c and d. Module c also imports module d. We should be able, starting with c, to unload d (and its children), but not unload the d that was required by module b.
    • @theKashey I'm not sure I explained this right.
ljharb commented 4 years ago

Modules (CJS or ESM) don't have parents in that way, because they're evaluated only once. In other words, the entire concept of "parent" is wrong and flawed, CJS never should have had it, and ESM shouldn't get it. What are the use cases that think they need it?

boneskull commented 4 years ago

I'll let @theKashey explain.

boneskull commented 4 years ago

(if he doesn't want to: in a nutshell, I think it has to do with being able to mock one "use" of a module, but not another)

theKashey commented 4 years ago

Frankly speaking, parent is needed only for speed optimization - as a parent is aware of it's children - children should know all parents.

The child->parent information could be build from the existing parent->child one, but it takes time, and needed for almost every graph operation. Like if you want to evict a module - you should delete a reference to from the module it uses, as long as it would or lead to 1) memory leak, or to 2) two different module graphs coexisting in one time.

So parent information is needed to maintain module graph integrity in a constant time, right now it requires a full traverse of existing graph to reconstruct child->parent connection, and still let's you create an isolated "areas", where a child has a reference to a deleted parent, or a parent has a reference to a deleted child.

theKashey commented 4 years ago

I think it has to do with being able to mock one "use" of a module, but not another

This is also a good case. The golden rule of "mock only what you own", or direct dependencies. However, it does not require access to the module cache, as it should occur on, let call it, "import time". In cjs terms - after require, but before Module._load; and the result should not be stored in the module cache at all.

ljharb commented 4 years ago

Right, but you'd do that with import maps or something like it - not at runtime in the module that's being imported.

I certainly see a use case for listing all parents of a module - ie, all importers - but that information would need to be statically determinable prior to the module's evaluation, and would thus not include dynamic importers, only static ones. I'm still unclear about the usefulness of that at runtime (as opposed to the imo more typical approach that would generate a dependency graph in advance, and then use it at runtime).

theKashey commented 4 years ago

In "runtime" it this information might be not so useful, as long as HMR is not expected to be an "often" event, so it's ok to use any other "slow" path. However, in tests it's better to be faster.

that information would need to be statically determinable

But not always.

const module = import(`./dynamic-module-${letter}`)

From my point of views importers might be statically determined on a build time, but not all those "detected" importers would actually "import" the module at some point of time. So their list is non-static by design. (and thus is it possible to skip other "static" requirements?)

ljharb commented 4 years ago

@theKashey and if that module was previously loaded, it would never be evaluated again, so any "parent" it was able to see would have been the first importer, not the import() callsite you're interested in.

It simply makes no sense for a module to evaluate with any knowledge of its own consumers.

theKashey commented 4 years ago

Firstly we are not talking about evaluation, but something after it - already established module graph. 👉Module reloading can happen only after the loading. Secondary in a few rare cases (it's more about old code patterns) you still need to know who is your parent - for example proxyquire uses it to resolve module paths relatively to the parent - aka Module._resolveFilename(file, ->parent<-). This is also a reason why proxyquire uses a self-eviction - it should work a bit different for the every invocation.

ljharb commented 4 years ago

If it's from an already established module graph, then it makes total sense to expose the graph in some way! However, I don't think it should be called a "parent" since module graphs are not parent-child relationships.

theKashey commented 4 years ago

We probably just went to deep and talking about details we don't own, and so don't need. There is no need to give us an access to some standard module system - it at least require throughful development of that structure, and will concrete some realisation. In normal development we have a clear separation between public API and internal realisation, and actually the absence of public API for module graph - is the real problem.

All we need is:

A week ago I've found that the majority of cache-delete operations(99.9%) actually leading to the memory leak. Ie deleting a module leaves reference to it in a parent, or a children; keeping references to children in the module as well. And the root cause - actually undocumented, and not expected to be "hacked", module graph structure.

Ayfri commented 4 years ago

Nothing has change with Node.js 14 ?

bcoe commented 3 years ago

@searls 👋 we've had this issue open for quite some time, that we'd like to make sure that it's possible to reload ESM modules in Node (to provide features like proxying.)

I'd noticed that quibble supports ESM modules. Are you finding that the platform already provides all the functionality that you need?

searls commented 3 years ago

@searls 👋 we've had this issue open for quite some time, that we'd like to make sure that it's possible to reload ESM modules in Node (to provide features like proxying.)

I'd noticed that quibble supports ESM modules. Are you finding that the platform already provides all the functionality that you need?

Thanks for reaching out! @giltayar is our resident ESM expert—what say you?

giltayar commented 3 years ago

You can find all the sordid technical details here: https://dev.to/giltayar/mock-all-you-want-supporting-es-modules-in-the-testdouble-js-mocking-library-3gh1

The TL;DR is this: while you cannot unload a module, you can use loaders to ensure that each time you want, a new version of the module will be loaded on import. For all practical purposes, this is the same as deleting the module from the cache, except for the fact that memory for the old version is not removed. Given that this is for development purposes, I don't believe this to be a major concern.

Note that using a loader means adding --loader <loader> to the command line. Which is fine, but if you want two loaders (e.g. testdouble and mocha-watch together), then you're out of luck, as multiple loaders has not yet been implemented, let alone defined. So mocking and watching together won't work.

Ayfri commented 3 years ago

Also you can use the same method as Deno, adding a #number that indicates that that is a new iteration of the module ?

giltayar commented 3 years ago

@Ayfri I use a query parameter, but it's the same method.

Where does demo use this?

Ayfri commented 3 years ago

https://github.com/denoland/deno/issues/6946

hpx7 commented 3 years ago

I'm still trying to figure out how to reload a module anytime it or any of its dependencies are modified.

The closest I've been able to get is this:

let mod = await import("./mod.js");
fs.watch(require.resolve("./mod.js"), async () => {
  mod = await import(`./mod.js#${Math.random()}`);
  console.log("reloaded", mod);
});

While that correctly reloads the module when it is directly modified, it doesn't reload when a dependency of the module is modified. For example, if mod.js imports from ./foo.js, I want to reload mod.js if foo.js is modified. Is there some way to accomplish this with the current tooling?

EDIT: I'm already using the ts-node/esm loader so I can't use another one