I think in the current design, all the loader hooks happen in a scope where no loaders exist. So if the ts-node hooks include `import ts from 'typescript'` at the top, `typescript` will be resolved per Node’s default resolution and not via any custom logic that another loader may have added. I’m not sure whether it’s easy or even feasible to behave otherwise; @JakobJingleheimer or @bmeck?
> all the loader hooks happen in a scope where no loaders exist.
Yes, I'm ~99% sure that is how it currently works: node-land's `esmLoader` has only `default*` in its collections of hooks, so that's all that will get called.
If hooks themselves needed other hooks to be involved in their own imports, those would need to somehow get added to node-land's esmLoader (which is not currently possible) here:
I'm not sure if this would be a good idea though as it could lead to a dead- or live-lock.
This also could be solved by giving a direct reference to the other loader in the chain.
Also that.
I almost said that, but didn't we (you?) say at one point that that was a baaad idea? (When I remember the convo, "baaad" is in your voice.)
@JakobJingleheimer the only problematic ideas are ones that break out of expected workflows. Giving the next loader to delegate to is just a simple dependency-injection style workflow. My concern and great fear is when people suggest handing over the "default" loader, as that escapes all sorts of mechanisms and workflows. Imagine the following code:
```js
// Sanitizes user input and marks it as sanitized
const toSanitized = new Map();

function defaultSanitize(input) {
  let result; // ... do something using input age and address ...
  toSanitized.set(input, result);
  return result;
}

function customSanitizePhysicalAddresses(input) {
  // ...
}

function customSanitizeAges(input) {
  // ...
}
```
Imagine these are chained in some way that we specify that both addresses and ages need to be sanitized.
If we alter `customSanitizePhysicalAddresses` and `customSanitizeAges` to give them access to `defaultSanitize` and allow short-circuiting, then it is trivial to accidentally or intentionally skip one of them even if we configured it to do both; even worse, in this example there is a cached result that could be improper. That is the problem. So you don't want to add a reference directly to `defaultSanitize`. You could add a reference to the next one in the chain, though, since that wouldn't allow escaping the workflow.
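To make the hazard concrete, here is a minimal sketch; the extra `defaultSanitize`/`next` parameters and the `input.age` field are hypothetical, purely to show how the escape happens:

```js
// Dangerous shape: the age sanitizer is handed defaultSanitize directly.
function customSanitizeAgesUnsafe(input, defaultSanitize) {
  if (typeof input.age !== 'number') {
    // Escape hatch: jumps straight to the default, skipping
    // customSanitizePhysicalAddresses entirely, and the result still ends up
    // cached in toSanitized as if the whole configured chain had run.
    return defaultSanitize(input);
  }
  return { ...input, age: Math.max(0, input.age) };
}

// Safer shape: the age sanitizer only ever sees the *next* step in the chain,
// so whatever was configured after it (addresses, then the default) still runs.
function customSanitizeAgesChained(input, next) {
  return next({ ...input, age: Math.max(0, input.age) });
}
```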
Adding it as a parameter like we do right now is clunky and confusing, but no, we can certainly give a direct reference to hooks as long as it isn't a direct reference to the default / it doesn't allow the proper workflow to leak around.
> This also could be solved by giving a direct reference to the other loader in the chain.
I was thinking the other way around: each loader could be passed an indirection to "the next loader in the chain". While evaluating loaders, calling these functions would forward to the default implementations, because there aren't next loaders yet. Then, as they get evaluated, they are replaced by the true next one.
Something akin to this, in pseudo-code:
```js
const loaders = [];

function addLoader(name) {
  const index = loaders.length;
  // Until the loader at `index + 1` has been evaluated, fall back to the
  // default loader; once it exists, forward to the true next one.
  const getNextLoader = () => loaders[index + 1] ?? defaultLoader;
  loaders.push(importLoader(name, {
    resolve: (...args) => getNextLoader().resolve(...args),
    load: (...args) => getNextLoader().load(...args),
  }));
}

for (const name of ['pnp', 'ts-node', 'coffeescript']) {
  addLoader(name);
}
```
Would that make sense?
I'm confused, why would you want the next loader instead of the previous one?
That's what the design document states:

> The hook functions nest: each one must always return a plain object, and the chaining happens as a result of each function calling `next()`, which is a reference to the subsequent loader’s hook.
Although it's a little unclear, since another line suggests a right-to-left evaluation:

> These would be called in the following sequence: `cache-buster` calls `http-to-https`, which calls `unpkg`. Or in JavaScript terms, `cacheBuster(httpToHttps(unpkg(input)))`.
But that would make the evaluation order at odds with the behaviour of `--require foo --require bar`, which are evaluated left-to-right 🤔
> I was thinking the other way around: each loader could be passed an indirection to "the next loader in the chain".
Indirection is good, but I don't think it should come from a hook parameter; some kind of construction-time dependency injection or a contextual global seems better, since it keeps the hook API signatures the same as those used when calling the next loader.
> calling these functions would forward to the default implementations, because there aren't next loaders yet. Then, as they get evaluated, they are replaced by the true next one.
I'm not sure I understand, why would you ever want it to change behavior as a race condition?
> I'm confused, why would you want the next loader instead of the previous one?
"next" / "parent" / "previous" / "child" are all bad terms, we never really found a good description for something closer to the default implementation vs farther and closer to the callsite that started the chain. We saw various descriptions of why things should be called contradictory terms but I do remember we had an old doc on this: https://docs.google.com/document/d/1J0zDFkwxojLXc36t2gcv1gZ-QnoTXSzK1O6mNAMlync/edit#heading=h.itxprzdnhhbo We should probably revisit since that is outdated and agree on terms, can't really agree until the type of composition is agreed upon though. Iterative vs recursive could make the terms more confusing if we cement them now but have to change them after.
In this case we are getting a reference to the loader closer to the default behavior / the one wrapping the default, which we must call in order to call the default.
> But that would make the evaluation order at odds with the behaviour of `--require foo --require bar`, which are evaluated left-to-right 🤔
The intent is to match `--require` in that `--loader a --loader b` has A’s hooks run first and A’s `next` is B’s hook, and B’s `next` is the default hook. Expressed in JavaScript terms this is `default(b(a(input)))`, which reads in the opposite order, yes, and makes this hard to explain 😄
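Here's a tiny sketch of that shape, assuming `resolve` hooks that receive the next hook as a callback; the query-string suffixes are made up purely to show which layer sees the input first:

```js
// Stand-in for Node's built-in resolution.
const defaultResolve = (specifier) =>
  ({ url: new URL(specifier, 'file:///app/').href });

// `--loader a --loader b`: A runs first, A's `next` is B, B's `next` is the default.
const resolveA = (specifier, next) => next(`${specifier}?from-a`);
const resolveB = (specifier, next) => next(`${specifier}&from-b`);

const chainedResolve = (specifier) =>
  resolveA(specifier, (s) => resolveB(s, defaultResolve));

// The default ends up seeing b(a(input)), i.e. default(b(a(input))).
console.log(chainedResolve('./foo.js'));
// { url: 'file:///app/foo.js?from-a&from-b' }
```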
that makes sense to me if you have a "next" model - ie, where a loader is expected to leave itself extensible by explicitly asking for the next one in the chain, a la express middleware.
An alternative would be a "wrap" model - where a loader can choose to rely on previously-established resolution or not, but it can't prevent successive loaders from making the same choice. In other words, loader A would be unable to prevent B from supplanting it - the last one to run wins, just like in JS. What are the reasons not to go with that sort of model?
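A sketch of what that "wrap" model could look like, assuming each loader is a factory that receives the currently-established `resolve` and returns a replacement (all names here are illustrative):

```js
// "Wrap" model: each loader receives the resolve established so far and
// returns a new one. The last loader registered wins unless it chooses to
// delegate, and nothing stops a later loader from wrapping it in turn.
function composeLoaders(loaderFactories, defaultResolve) {
  let resolve = defaultResolve;
  for (const makeResolve of loaderFactories) {
    resolve = makeResolve(resolve); // previous resolution is an input, not an escape hatch
  }
  return resolve;
}

// Example factory: handle .ts specifiers, delegate everything else to whatever
// resolution was already in place.
const tsLoader = (previousResolve) => (specifier) =>
  specifier.endsWith('.ts')
    ? { url: `file:///compiled/${specifier.replace(/\.ts$/, '.js')}` }
    : previousResolve(specifier);
```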
In the current model, loaders actually have their own module registry, separate from the main loader, so there is no concept of loader loaders, I believe. Note that one way around this is just to build loaders with Webpack or otherwise into a single file that includes all the resources; that would likely result in the best startup time anyway, since you won't be caching e.g. TypeScript between the loader registry and the main registry.
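For what it's worth, a rough (untested) Webpack 5 sketch of that single-file approach; the entry point and file names are placeholders:

```js
// webpack.config.js — bundle a loader and its dependencies (e.g. typescript)
// into one self-contained ESM file so nothing needs to be resolved at runtime.
module.exports = {
  mode: 'production',
  target: 'node',
  entry: './loader.mjs',
  experiments: { outputModule: true }, // emit ESM output
  output: {
    filename: 'loader.bundled.mjs',
    library: { type: 'module' },
  },
};
```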
That's fine if you're publishing a loader that composes others, but how would end users themselves compose loaders? It seems pretty unreasonable to expect an application user to pull in a bundler any time they want to use more than one loader together.
I'm thinking of the current "foo-register" model, where i can handle any unhandleable format by transforming it into a handleable one - so for example, in a node that doesn't understand esm, i can use all the loaders out there for "X to ESM" as long as i've first pulled in one that does "ESM to CJS".
@ljharb we have had a multitude of meetings on this exact topic; feel free to review or attend for more questions, but this seems to be moving off the topic of the original issue. @guybedford is just mentioning a workaround for the normal flow. Indeed it will be even more isolated when the loaders are moved off-thread, as they won't even share a single loader module map when registering multiple loaders.
> I was thinking the other way around: each loader could be passed an indirection to "the next loader in the chain".
@arcanis: I find this a good idea and also think that this or a similar mechanism is necessary for chains with length > 2 (i.e. more than one loader plus the default), since it would allow one loader to delegate to the rest of the chain. Otherwise encapsulation and wrapping of existing functionality might get complicated.
Per #58 we’re going to keep this as an open feature request but defer it for now, at least until chaining has landed (as this is essentially another level of chaining).
I didn't see this situation described in the doc: what's the plan for when loaders need to leverage the previous ones to set up the next ones? For example, take this command line:
The `pnp` loader is required to load the dependencies from the right place. Not only is this true for the imports of `foo.ts`, but it is also true for the `ts-node` loader itself, since it's a dependency like any other (Node wouldn't know where to find `ts-node` if the `pnp` loader wasn't available). I think this is relatively easily solved by stating that the ambient loader is augmented as loaders are loaded, not just once at the end, but I just wanted to check that this had been formalized somewhere.
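A sketch of what "augmented as loaders are loaded" could look like during bootstrap; the `ambientLoader.import`/`addHooks` methods are hypothetical, the point is only the ordering:

```js
// Hypothetical bootstrap: each --loader argument is itself imported through the
// hooks registered so far, so `ts-node` can be located via the `pnp` loader
// that was registered just before it.
async function bootstrapLoaders(loaderSpecifiers, ambientLoader) {
  for (const specifier of loaderSpecifiers) {
    // The ambient loader already contains every previously-registered hook,
    // so this import benefits from them (e.g. pnp resolution for ts-node).
    const { resolve, load } = await ambientLoader.import(specifier);
    ambientLoader.addHooks({ resolve, load });
  }
}

// e.g. bootstrapLoaders(['pnp', 'ts-node'], ambientLoader);
```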