Proposal: Rename “loaders”

GeoffreyBooth commented 2 years ago

In many of the discussions on this repo and in https://github.com/nodejs/node, we’ve considered various parts of Node that users might want to customize beyond just module resolution and loading. Within this repo one major use case discussed has been virtual filesystems, which would require hooking into many (if not all) of the fs methods. In https://github.com/nodejs/node/issues/43818#issuecomment-1184415188 @cspotcode has compiled a great (incomplete) list of places where someone adding support for TypeScript to Node might want to customize Node:

repl
- tab completion (current api insufficient)
- top-level await (mechanism not exposed)
inherit configuration blob to worker_threads and child_processes (currently requires hacks)
stack trace support (fix bugs & perf in node’s current implementation)
stack trace cache setter (haven’t looked at this one in a while)
proper APIs into node’s CJS exports discovery mechanism (discover ESM named exports in CJS source code, no way for transpiled code to contribute to this process)

I’m thinking perhaps a better name for this repo and the Loaders API itself would be “hooks” / Hooks API. This would better reflect even the existing API, since arguably globalPreload has little to do with module resolution and loading. We could also move the loaders documentation out of the ESM page in the docs and into its own page. cc @nodejs/loaders

If the current loaders team likes this idea, I can open an issue on nodejs/admin to make it happen from a technical standpoint.

Update per https://github.com/nodejs/loaders/issues/95#issuecomment-1199833460, here are the new names:

Team: “Customization.” (Likewise for this repo, GitHub group, etc.)
Docs: new top-level section “Customization API.” Move “ESM » Loaders » Hooks” to “Customization API » Module Hooks”.
Flag(s): No change for now, but reevaluate as part of moving loaders off-thread.

JakobJingleheimer commented 2 years ago

I think this sounds like a sound idea. It enormously expands our scope/purview though.

GeoffreyBooth commented 2 years ago

I think this sounds like a sound idea. It enormously expands our scope/purview though.

Well I think those other things were already part of our scope 😄 just informally so.

GeoffreyBooth commented 2 years ago

cc @nodejs/tsc for awareness and permission. I can give an update at the next TSC meeting.

mcollina commented 2 years ago

I'm +1 on a terminology change, but I'm not sure we can use hooks as we already have async_hooks and performance_hooks. These might be loader_hooks possibly?

JakobJingleheimer commented 2 years ago

I'm not sure we can use hooks as we already have async_hooks and performance_hooks.

Ah, good shout. I definitely don't want to be responsible for asnyc_hooks 😵‍💫

cspotcode commented 2 years ago

Should definitely consider a single CLI flag that is empowered to make all necessary runtime customizations. It fits the goals of nodejs/node#43818 and nodejs/node#43408: runtime customization with good user-experience, acknowledging that TS support (and other stuff) requires more than just loader hooks. Splitting hook registration mechanisms across multiple CLI flags feels like bad user experience.

GeoffreyBooth commented 2 years ago

I'm not sure we can use hooks as we already have async_hooks and performance_hooks.

Ah, good shout. I definitely don't want to be responsible for asnyc_hooks 😵‍💫

Then customization_hooks? At least for the namespace, if we even need one; and --hooks for the flag?

"Loader hooks" doesn't make sense because many of the types of customization people want to do don't involve loaders.

JakobJingleheimer commented 2 years ago

Then customization_hooks? At least for the namespace, if we even need one; and --hooks for the flag?

Hrrm, I think we can't use --hooks for the reason Matteo cited :( Although, I do like it.

I don't love customization_hooks 😅 hooks are inherently customisation, so that just adds letters. These are things that affect how node sets itself up, so maybe… init-hook, setup-hooks, or startup-hooks?

arcanis commented 2 years ago

integration_hooks? language_hooks?

GeoffreyBooth commented 2 years ago

Hooks aren't inherently customization. Async hooks and performance hooks aren't about customizing Node.

Also the hooks are registered during startup but they apply to the lifetime of the process. I don't see them as related to startup.

Qard commented 2 years ago

Naming things--the hardest problem in computer science. 😅

Seriously though, I think the current "loaders" naming is fine until we actually specify how the system is expanding. The eventual shape of this subsystem is very nebulous. I don't think the right name for it will be clear until that lack of scope clarity is resolved.

GeoffreyBooth commented 2 years ago

I don’t think the right name for it will be clear until that lack of scope clarity is resolved.

I think we know how it’s expanding; I wrote a list in the top post. We don’t need to map out the specifics of each of those items to know that we’re essentially creating opportunities for users to give Node functions to call at critical points in Node’s internal flows, where those functions change Node’s behavior in some desirable way.

Historically, customization of Node’s behavior has been done via monkey-patching; CommonJS made it possible to override publicly accessible Node APIs with custom versions, and so that became the go-to (and for many things, only) way to customize. ESM put a stop to that, not because the Node team wanted to clamp down on this but because the ESM spec itself meant that all the code in an application needed to be resolved and parsed before any of it was executed, and so therefore any code that attempted to monkey-patch would be “too late” in that what it’s attempting to override would have already been loaded. Hence loaders were born, initially focused on restoring this lost monkey-patching ability.

As time goes on, though, it’s become clear that monkey-patching—whether the CommonJS version or the ESM loaders version—is a blunt tool. It’s better to provide intentional APIs for specific purposes, like Error.prepareStackTrace rather than overriding whatever “print stack trace” function is within Node. The current loaders API provides a great way to do so: define functions with specific names to tell Node where you want custom code registered to either override Node’s version of a function or supplement part of Node’s flow (like by mutating input before Node accepts it or by mutating output after Node produces it). Even the current loader hooks, though, do much more than extend Node’s “loading” of files: globalPreload is about injecting custom global-scope code, and use cases like instrumentation are far afield from the traditional loader use case of supporting an additional file type.

So in my mind, the current loaders API is really a “register these functions to alter Node’s behavior at specified points” API. And as time goes on we’ll be adding support for more named functions beyond resolve, load and globalPreload. Hence the desire for a name that better reflects what this API is.

@mcollina I don’t know if we’re ever likely to need an API namespace like async_hooks; the loaders API currently doesn’t involve anything imported off of any Node builtin, and if we do eventually add some helpers those would probably hang off of the relevant existing API. Stuff involving source maps would probably live on node:module like the current Source Maps API, for example; if we someday provide a hook for customizing DNS resolution I’d think that would live on node:dns, etc. So I don’t think the existence of async_hooks and performance_hooks is necessarily disqualifying, though I agree that perhaps “Hooks API” by itself is too generic. If we’re going to qualify it, though, we shouldn’t make the mistake of being too constraining since this API could very easily be extended to allow customization of any corner of Node.

Somewhat related is this issue for Deno someday adding support for types of transpilation other than TypeScript: https://github.com/denoland/deno/issues/1739. They just call it a “public API,” which is perhaps too generic. You could also argue that maybe these functions are essentially plugins, and so what we’re really providing here is a Node Plugins API. (Or middleware, or injected functions. . . .) Another source of inspiration that feels similar to me is WordPress Hooks: https://developer.wordpress.org/plugins/hooks/. They work actually quite similarly to our hooks, and there’s a lot of language on that page that could provide ideas for names:

Hooks are a way for one piece of code to interact/modify another piece of code at specific, pre-defined spots. They make up the foundation for how plugins and themes interact with WordPress Core . . . There are two types of hooks: Actions and Filters. To use either, you need to write a custom function known as a Callback, and then register it with a WordPress hook for a specific action or filter. Actions allow you to add data or change how WordPress operates. Actions will run at a specific point in the execution of WordPress Core, plugins, and themes. . . . Filters give you the ability to change data during the execution of WordPress Core, plugins, and themes.

Qard commented 2 years ago

It sounds like you're trying to adapt loaders to do what diagnostics_channel already does. APMs are already using diagnostics_channel to replace monkey-patching. Our use of loaders is somewhat a temporary hack to fill the gap while transitioning the ecosystem away from needing monkey-patching to in the future supporting diagnostics_channel events.

The main difference between that and what you propose is the addition of being able to modify behaviour, which was explicitly avoided by diagnostics_channel for security and stability reasons. It was our assessment that it made more sense to make more narrowly-focused APIs in any places that needed to change behaviour. Node.js is full of years of attempts at all-encompassing systems collapsing under their own weight and becoming unmaintainable to all but the extremely dedicated. Domains is one example, async_hooks is another, and AsyncListener before that.

I would focus on stabilizing loader hooks before considering expanding the scope even further to include an even broader collection of things, which will likely drag on experimental status indefinitely as there will always be more use cases. If you still intend to kick off other efforts I would intentionally bucket them separately, even if they're connected features, just to keep focused scope around each effort and have separate paths for stabilization of each so they're not cross-dependent in ways that could potentially deadlock progress in the future.

GeoffreyBooth commented 2 years ago

I would focus on stabilizing loader hooks before considering expanding the scope even further to include an even broader collection of things, which will likely drag on experimental status indefinitely as there will always be more use cases.

Agree on this. But before making loaders non-experimental, we need to pick a name. If it’s going to be something other than the Loaders API, then the --loader flag will need to change; and that should happen while this is still experimental.

Earlier today I landed #97 to update our roadmap: https://github.com/nodejs/loaders#status. I divided our to-do list into two phases based on what I thought was needed for us to consider ESM to be at parity with CommonJS, which was the point of the loaders API. We’re pretty close to that milestone (the first set of tasks). We could possibly move this API from experimental to stable at that point, assuming we can settle on a permanent name.

Qard commented 2 years ago

I don't see why the flag would need to change. It's for defining loaders. What we're talking about is something entirely different.

JakobJingleheimer commented 2 years ago

For what we currently call "loaders", what about --esm-hooks or --module-hooks (it's both literally a module of hooks and hooks for modules) for the flag and then remove "loaders" completely as a wrapper for the hooks? The "loaders" section of the ESM docs has basically 1 child, "hooks":

Loaders
- Hooks
- resolve(specifier, context, nextResolve)
- load(url, context, nextLoad)
- globalPreload()
- Examples
- HTTPS loader
- Transpiler loader

The name loader is confusing because one of its hooks is called load; we already mostly refer to the hooks individually (eg load hooks or now load hooks chain).

Then for subsequent hooks, be they fs hooks, or what-have-you, cross those bridges when we come to them? Or, are we thinking we might want to glob all the hooks in the same file? eg if we ever add fs hooks, they would live in the same file as ESM hooks?

export async function resolve() {}
export async function load() {}
export async function readFile() {}
// …

If that's the case, I think the flag should merely be --hooks and we say the async_hooks and perf_hooks are an unfortunate coincidence (they wouldn't go in that file anyway, so I think not too much confusion there).

Our team name could be "Hooks" or something—collaborators would all know that we handle all the hooks except async_hooks and performance_hook.

GeoffreyBooth commented 2 years ago

Or, are we thinking we might want to glob all the hooks in the same file?

Yes, this was my thinking. That file could import other files, but ultimately the API is this top-level file that exports functions of defined names. It makes sense to group them all together because something like TypeScript will use many of them.

mhdawson commented 2 years ago

@GeoffreyBooth a bit off topic but is https://github.com/nodejs/node/pull/41076 the current status for file system hooks or are there other issues I should go looking for?

GeoffreyBooth commented 2 years ago

@GeoffreyBooth a bit off topic but is nodejs/node#41076 the current status for file system hooks or are there other issues I should go looking for?

Yes, as far as I’m aware. @arcanis can confirm.

GeoffreyBooth commented 2 years ago

Our team name could be “Hooks” or something—collaborators would all know that we handle all the hooks except async_hooks and performance_hook.

We could also maybe have a flag name like --register, in the spirit of the old CommonJS loaders like require('coffeescript/register') and require('@babel/register'). So --register ts-node/hooks would be like saying “register the TypeScript hooks” which corresponds with the use case, I think.

Then the API itself could be named something more verbose, whether Customization Hooks or Plugin Hooks or Process Hooks (because they affect the Node process) or whatever else. Or just Hooks if everyone is okay with that.

arcanis commented 2 years ago

For now I've paused #41076 to first see whether the helper API, together with node:fs overrides via loaders, is enough for the use case I was pursuing.

cspotcode commented 2 years ago

Then for subsequent hooks, be they fs hooks, or what-have-you, cross those bridges when we come to them? Or, are we thinking we might want to glob all the hooks in the same file? eg if we ever add fs hooks, they would live in the same file as ESM hooks?

This is related to user experience. If those hooks require a different CLI flag, then node --hooks ts-node becomes more complex than it needs to be. Put another way, if the user knows they want to use a one-stop-shopping solution to a particular problem, they won't enjoy if node decided to split hooks across multiple CLI flags.

aduh95 commented 2 years ago

I'm not convinced renaming the --loader flag to --hooks is worth it, maybe we'd be better off by using a different flags for hooks that have nothing to do with loaders – or even reuse an existing flag (e.g. --require or the soon to be --import). IMHO, anything that can't work from the loaders thread would at a odd place in the same "loaders module", it would be a mistake to build a design that forces us to parse and execute twice the file (once on each thread) – it would be very much unexpected that code in the top-level of the module would run twice.

cspotcode commented 2 years ago

I like the idea of a --hooks or --register that runs in the main thread at startup and exposes various hooking and runtime customization APIs. One of those APIs can be registerLoader(moduleSpecifier, configuration) to install a loader.

Avoids the double-execution, avoids the need for users to pass multiple CLI flags, allows programmatic bootstrapping in the main thread, so it allows config file resolution and whatnot.

GeoffreyBooth commented 2 years ago

In the recent loaders meeting, @aduh95 pointed out that once loaders move off-thread, we can’t have a single user file like --loader foo.js that contains both module hooks like resolve and non-module (and therefore main thread) hooks like whatever we create for customizing the REPL. (Or rather, we could have a single user file with both types of hooks, but it would be evaluated twice, which feels bad.) So I think the least-bad solution for this would be to have two flags, like --hooks and --module-hooks, for the main thread and loader thread hooks.

The docs describe all of Node’s flags and supported environment variables in a section called “Command-line API.” I think once we build support for some way of specifying those options via a config file (#98) that section should probably be renamed something like Configuration API and include flags, environment variables and config files. Then we could create a new section called Customization API with subsections Module Hooks, REPL Hooks, Source Maps Hooks and so on, for all the parts of Node where we would support customization via user-supplied functions. This team could become the Customization Team, to include the scope of customization across Node, or the Module Hooks team to stay focused on customization particular to the module systems (CommonJS and ESM). Thoughts?

cspotcode commented 2 years ago

The main thread hooks should still be able to programmatically register and configure loader hooks on the user's behalf. Or rather, a single declaration should be all that's needed for a user to opt-in to a suite of hooks. Otherwise user experience is bad in the "one-stop shop" use-case. (https://github.com/nodejs/loaders/issues/95#issuecomment-1189325828)

GeoffreyBooth commented 2 years ago

The main thread hooks should still be able to programmatically register and configure loader hooks on the user’s behalf.

If this is possible, by all means, sure. The “two flags” approach is only if we can’t get a single flag to work.

Anyway, what about the rest? I’d like to wrap up this discussion and reach a decision.

Team: “Customization.” (Likewise for this repo, GitHub group, etc.)
Docs: new top-level section “Customization API.” Move “ESM » Loaders » Hooks” to “Customization API » Module Hooks”.
Flag(s): If we can get away with only one flag, --register. If we need two, --register and --register-module-hooks. (We can punt this decision for now until we move loaders off-thread and discover if we actually need two flags or not, and come up with their names then.)

What do you all think?

cspotcode commented 2 years ago

If this is possible, by all means, sure. The “two flags” approach is only if we can’t get a single flag to work.

Just a nit, but in my mind, the two flags will always exist, because they serve an important use-case: avoiding unnecessary execution in the main thread. But it should also be possible to apply both kinds of hooks with a single flag. Not sure if we already agree on that or if it is worth clarifying.

GeoffreyBooth commented 2 years ago

Could we perhaps resolve this discussion without needing a meeting? Slight amendment to my last comment per https://github.com/nodejs/loaders/issues/105#issuecomment-1199810967:

Team: “Customization.” (Likewise for this repo, GitHub group, etc.)
Docs: new top-level section “Customization API.” Move “ESM » Loaders » Hooks” to “Customization API » Module Hooks”.
Flag(s): No change for now, but reevaluate as part of moving loaders off-thread.

Please reply with:

👍 if you approve
👎 if you prefer the status quo
👀 if you’d prefer some other alternative

GeoffreyBooth commented 2 years ago

Last call for votes ☝️ otherwise I think we have a winner.

GeoffreyBooth commented 2 years ago

Closing per https://github.com/nodejs/loaders/issues/95#issuecomment-1213202138. I think we can consider the next steps as listed in https://github.com/nodejs/loaders/issues/95#issuecomment-1199833460 to be approved. I’ll try to get around to the administrative ones (repos, etc.) soon; if anyone wants to tackle the docs PR, please drop a comment here letting everyone know you want to work on it.

nodejs / loaders

Proposal: Rename “loaders” #95