Proposal for single-mode packages with optional fallbacks for older versions of node

weswigham commented 5 years ago

This is a counter-proposal to nodejs/modules#273.

A bunch of us have brought this up during informal dual-mode discussions before, but the very concept of "dual mode" packages which execute different code for cjs vs esm callers is distasteful. It pollutes a single identifier (the package name or specifier) with two distinct and potentially irreconcilable package identities (which themselves potentially contain duplicated identities and dependencies). From anything other than the perspective of the registry, it's effectively shipping two separate packages for a single runtime - one for esm callers and one for cjs callers.

The original desire for so-called "dual mode" packages derives from two uses:

The ability to support cjs consumers in the newest versions of node.
The ability to retain support for older versions of node, while still supporting esm where possible.

In this proposal, I will tackle the two issues separately.

First, case 1: In the current implementation, a esm-authored package cannot be require'd. This means that you cut support for all cjs consumers when you migrate to esm. This kind of hard cut is, IMO, obviously undesirable, and both the "dual-mode" proposal and this proposal seek to remedy this. In the "dual-mode" proposal, this is solved by shipping seperate cjs code alongside the esm code, which is loaded instead of the esm code. In this proposal, the esm code itself is loaded. This means that when a package specifies that it has an entrypoint that is esm, it will only ever be loaded as esm. The astute in the crowd would note that while yes, that's all well and good, the cjs resolver is synchronous, while we've specified that the esm resolver is asynchronous - a seemingly irreconcilable difference. I arrive to tell you something: this is not so. An esm-based require may be executed async, but appear to be synchronous to the cjs caller - similarly to how child_process.execSync works today (and, in fact, using similar machinery). This synchronization only affects instantiation and resolution - the execution phase remains untouched, so if at some point in the future top-level await becomes a thing, depending on variant, either the require can conditionally return a promise (if TLA is supposed to be blocking) or happily return the exports object while the module is still asynchronously executing. The only other concern would be the observable affects on user-defined loaders, which, if we follow through on that design with out-of-process (or at least out-of-context) loaders (which are very desirable from an isolation perspective), the solution there, likewise, is simply using an apparently synchronous execution of the async loader, just as with the builtin loader. In-process loaders can also be syncified (and in fact are in my current implementation), but care must be taken to not have a user loader depend on a task which is meant to resolve in the main "thread" of execution (since it will not receive an opportunity to execute, as only the child event loop will be turned) - this means relying on a promise made before the loader was called is a no-go. This shouldn't be a problem (the builtin loader, despite being written to allow async actions, is actually fully synchronous in nature), and, again, is fully mitigated by having loaders in a new context (where they cannot possibly directly depend on such a task).

By allowing esm to be require'd in cjs, we can close the difference between the esm and cjs resolvers. If the cjs resolver can resolve a file, IMO, in the esm resolver it should resolve to the same thing or issue an error - it should not be possible to get two different identities for the same specifier based on the resolver used (this implies that the esm resolver we use should be a subset of or identical to the cjs resolver). This means implementing knowledge of "type": "module" and the like into the cjs resolver, since afaik, it's not already there (though it does already reserve .mjs).

And case 2: With the above, a package can only have one runtime execution associated with it, however that only requires shipping esm. To support older versions of node, a cjs version of a package must be shipped. An answer falls out naturally from extension priorities: Older versions of node do not search for the .mjs extension. Making your entrypoint a .mjs/.js pair (with .js as cjs), the .mjs will be found by versions of node which support esm, while the .js will be found by older versions of node. This is ofc constrained by what older versions of node already support - there's no "work to be done" to improve this, just work to ensure that it continues to work and that it does not become blocked by some other modification. naturally, this means in a package with a cjs fallback for an older node, the entrypoint cannot be a .js esm file - however other esm in the project can be (by placing a package.json with {"type": "module"} in whichever project subfolder has the esm source beyond the entrypoint). This, too, has been alluded to by many people in many other issues, yet has not yet been written down.

TL;DR:

Synchronous require of esm can be done. Doing so allows an esm to actually replace a cjs one, and alleviates the need for a "dual mode" resolver system. This also greatly aids the migration story (especially when paired with the dynamic modules spec, which allow for more than just the default member to overlap during migration).
Patch the esm (and cjs) resolver to be a proper subset of the cjs resolver and resolve to the same cache entry or an error for each specifier which can be represented in both (cjs obviously does not respect URLs in any way, so any URL resolution remains solely the domain of the esm loader)
cjs fallbacks for older versions of node come free with the extension searching that older versions of node already do (so long as we allow a higher priority entrypoint to be found in newer versions of node, this method of backcompat comes "free"). Without that, an alternate esm entrypoint is useful. Neither is tied to this, but it's worth mentioning that it's still easily doable.

bmeck commented 4 years ago

@weswigham can we close this particular proposal?

weswigham commented 4 years ago

Nope. While conceptually I don't think anything was wrong with the original proposal (indeed, I don't think there were any objections on anything other than technical grounds), it seemed like some people would not be convinced of the implementation (or speculated changes thereupon) until they could see how it actually plays with TLA in practice. Myles looks to be merging an unflagged TLA implementation in the (near?) future (which I've been waiting for); once that's in, a revised implementation that handles TLA should be doable (well, no less than 3 seperate implementations that each handle TLA differently are, anyway). I'm hoping that showing the interop in practice should assuage some concerns.

GeoffreyBooth commented 4 years ago

once that's in, a revised implementation that handles TLA should be doable

Perhaps we can close this thread and you can open a new one that takes into account the TLA that's merged in?

weswigham commented 4 years ago

Nothing in the OP is out of date - it would be a nearly verbatim repost (as only technical implementation details are changing). There's no reason to.

weswigham commented 4 years ago

Ah, yep, https://github.com/nodejs/node/pull/34558 landed yesterday. I might be able to remake/update https://github.com/nodejs/node/pull/30891 in the hopefully nearish future.

nodejs / node

Proposal for single-mode packages with optional fallbacks for older versions of node #49450