nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
107.03k stars 29.29k forks source link

Discussion: New “ESM by default” mode #49432

Open GeoffreyBooth opened 1 year ago

GeoffreyBooth commented 1 year ago

Building off of https://github.com/nodejs/node/pull/49295#issuecomment-1699430613, we’re considering a new mode where all of the current places where Node defaults to CommonJS would instead default to ESM. I think this should be enabled by flag, and once it ships we can potentially ship a separate binary where it’s enabled by default; and/or we can someday make the new mode Node’s default in a semver-major change.

The use cases solved by such a mode are (and I’ll edit this comment to update this list as people think of more):

To handle these use cases, the changes that this mode should make are (and likewise I’ll edit this to reflect consensus):

This new flag will be named per the discussion in https://github.com/nodejs/node/issues/49541.

Prior art: https://github.com/nodejs/node/pull/32394, https://github.com/nodejs/node/pull/49295, https://github.com/nodejs/node/pull/49407

@LiviaMedeiros @nodejs/loaders @nodejs/wasi @nodejs/tsc

mcollina commented 1 year ago

I don't think this is feasible to change the default if package.json is present due to the massive amount of modules available on npm.

My 2 cents on how to solve this in a much easier way: let's make npm init add the necessary field in package.json.

GeoffreyBooth commented 1 year ago

I don’t think this is feasible to change the default if package.json is present due to the massive amount of modules available on npm.

I’ve considered this and I don’t think it’s a blocker. As discussed in https://github.com/nodejs/node/pull/49295#issuecomment-1703017541, there are many potential solutions, the simplest of which is to just crawl through subfolders adding "type": "commonjs" to package.json files that lack it. I don’t think this needs to be a worry or something we need to deal with at this point.

Also, the proposal isn’t to go straight to changing Node’s default. That will be months if not years away, once this flag has shipped for a long time and issues have been resolved. The time for considering the impact of changing Node’s default is once this flag has been available for a while and we can see how it works. We may never change Node’s default, and that’s fine; this flag and mode has merit regardless. The point of mentioning that this might lead to changing Node’s default is so that we design with that potential goal in mind, so that whatever we add has a clear way to opt out if/when it becomes the new default.

bmeck commented 1 year ago

@GeoffreyBooth how does adding that field work for people using other languages/processes that spawn child processes for existing codebases? I'm mostly curious since it is a much larger breaking change than most we see and affects all sorts of runners of node applications in non-simplistic ways.

GeoffreyBooth commented 1 year ago

how does adding that field work for people using other languages/processes that spawn child processes for existing codebases? I’m mostly curious since it is a much larger breaking change than most we see and affects all sorts of runners of node applications in non-simplistic ways.

First off, adding a flag isn’t a breaking change. The breaking change would only be if we make the flag the new default behavior, which I don’t think we need to worry about just yet. It’s premature to consider larger ecosystem effects of making this new mode the default when we haven’t designed, much less implemented, the flag to enable this new mode.

The flag is no different than achieving these effects via module customization hooks (the changes that the hooks can achieve, anyway). Just as other languages/processes wouldn’t know how you’re customizing the files in your project when using hooks, they wouldn’t know whether your project is meant to run under this new flag or not. This is a problem we already have, and it’s on those other tools to provide a way for the user to signal to them that they should expect Node to be in this new mode, like how tsconfig.json is used. Shipping the flag long before we change Node’s default allows the ecosystem time to adapt.

One thing we can do to help ease the transition is to start making the "type" field mandatory. As in, you don’t need to have a package.json file, but if you have one, it needs a "type" field. Then at least for applications, which are our dominant use case, there’s still a deterministic way for tools that work with apps to know how all the files within that application should be interpreted. We could make this one of the effects of the flag, so that in the new mode we need explicit package.json types.

bmeck commented 1 year ago

@GeoffreyBooth I'm not opposing the flag in any way, I'm just concerned about the default swap which has been attempted in the past in https://github.com/nodejs/node/pull/32394

bmeck commented 1 year ago

Then at least for applications, which are our dominant use case, there’s still a deterministic way for tools that work with apps to know how all the files within that application should be interpreted. We could make this one of the effects of the flag, so that in the new mode we need explicit package.json types.

I'm more skeptical on this due to files being loaded for config/init like ~/.*rc files outside of the application directory.

joyeecheung commented 1 year ago

I think the idea of having multiple releases in general is not great - if we want to do it, builds with pointer compression enabled would’ve been a thing already and then we would have another headache about what combinations of releases we want to maintain….

A runtime flag that makes ESM the default sounds good to me though, why can’t we just have that and allow it in NODE_OPTIONS? That should be enough for the use cases mentioned in the OP already

GeoffreyBooth commented 1 year ago

I’m more skeptical on this due to files being loaded for config/init like ~/.*rc files outside of the application directory.

There’s an infinitely long tail of ecosystem tools and libraries that don’t support whatever latest and greatest features we ship. Several years ago there were presumably tools that relied on the fact that .js files were CommonJS, and those tools needed to update to check for the type field before making that assumption; this is no different. If we limit ourselves by what today’s tools expect out of Node, then we’d never be able to change anything, even additive/non-breaking changes. Users can make the choice of deciding to use this flag or deciding to use a tool that's incompatible with it.

I think the idea of having multiple releases in general is not great

What do you mean by “multiple releases”?

joyeecheung commented 1 year ago

What do you mean by “multiple releases”?

multiple binaries, e.g. if you can have node-esm, then you can also also have node-pc (pointer compression), then you can also have a combination of node-esm-pc, then you can have…other combinations

GeoffreyBooth commented 1 year ago

What do you mean by “multiple releases”?

multiple binaries, e.g. if you can have node-esm, then you can also also have node-pc (pointer compression), then you can also have a combination of node-esm-pc, then you can have…other combinations

Yes, that’s perhaps an argument against shipping an alternate binary (see https://github.com/nodejs/node/pull/49407). I think we should worry about the flag first before contemplating binaries; I think the flag stands on its own, and maybe there’s a solution for a binary with that flag enabled that doesn’t involve us needing to ship or maintain it, but we can worry about it later.

A runtime flag that makes ESM the default sounds good to me though, why can’t we just have that and allow it in NODE_OPTIONS? That should be enough for the use cases mentioned in the OP already

Shebangs can’t include flags on many platforms. See https://github.com/nodejs/node/pull/49295#issuecomment-1699503964 and linked and following comments. I don’t know if shell scripts have much need for pointer compression, but assuming they don’t, then the shebang case is an argument for a separate binary, as shell scripts being unable to specify flags is a particular constraint that only a separate binary can solve, and hopefully this is the only Node option that such scripts need control over.

Anyway to your larger point, yes, let’s ship the flag first and go from there. Many platforms do support flags in shebangs, and there are other use cases solved by the flag, so we’re still making a lot of progress just with the flag by itself at first.

mcollina commented 1 year ago

As discussed in #49295 (comment), there are many potential solutions, the simplest of which is to just crawl through subfolders adding "type": "commonjs" to package.json files that lack it. I don’t think this needs to be a worry or something we need to deal with at this point.

We are not in agreement. The ecosystem concern is something that needs to be at the center of this design. Changing the default of how a module from node_modules is evaluated is so massively breaking that should be off the table. A node.js that cannot run most of the registry is close to useless. The way this is outlined will create python2 / python3 break, which we should be avoiding at all cost.

Note that I'm not opposed of changing the default when a package.json is not available. I think it would be a significant improvement for shell scripts and the like.. I wish I had it in quite a few occasions. Also: "type": "module" should be the default on npm init at some point in the close future.

GeoffreyBooth commented 1 year ago

We are not in agreement. The ecosystem concern is something that needs to be at the center of this design. Changing the default of how a module from node_modules is evaluated is so massively breaking that should be off the table. A node.js that cannot run most of the registry is close to useless. The way this is outlined will create python2 / python3 break, which we should be avoiding at all cost.

A new flag will not cause a break. Full stop. There is no breaking change proposed here.

The proposed behavior under the new flag will require an extra patch step until package managers catch up and insert the missing type field during package installation. This is not a big change; users can do it themselves via something like npm install && npx add-missing-type-fields. Calling this a Python 2/3 break is extreme hyperbole. I expect that package managers will begin doing it automatically once we ship the flag and they can test against this mode; or package managers will do it automatically when run by Node in this new mode, or at the very least offer an option to do so, saving users from needing to run the patch command. This is not a big deal. No package maintainers will need to update anything if they don’t want to. Even package managers don’t need to update, since we can publish an add-missing-type-fields script that users can run via npx to do the crawl and patch. We could even have Node throw an error upon encountering a typeless package.json: Error: ./node_modules/foo/package.json lacks a "type" field. Patch it by running npx add-missing-type-fields.

In https://github.com/nodejs/node/pull/49295#issuecomment-1699430613 and following comments we discussed the pros and cons of preserving the “missing type field still means CommonJS” behavior in the new mode. Doing so would mean avoiding the need for this patch step and/or avoiding package managers needing to update, but ultimately we wouldn’t be achieving the central goal of making ESM the default: it still wouldn’t be the default if a lack of type field still means CommonJS. And since the fix is so simple, that someone could probably implement in a Bash one-liner or a dozen lines of JavaScript, I think the patch approach is better as it allows us to get to a full ESM default without caveats. And once package managers catch up there won’t be any work for users to do.

There are other approaches. We could treat node_modules as a special case, where in this new mode, subfolders of node_modules preserve the “missing type = CommonJS” behavior. Another idea is that in this new mode, a missing type could be treated as a fallthrough, and Node continues to search up the tree for a package.json with a type, and the user could put a package.json with "type": "commonjs" in the node_modules root folder. I’m sure there are other variations people can think of. Personally though I think it’s better to have the same behavior on both sides of the mode, and just have the flag flip the default: it’s simpler, easier to reason about, and easier for other tools to follow. Yes it means running a patch script, at least at first. That’s a small price to pay, in my opinion, and short-term.

Also: "type": "module" should be the default on npm init at some point in the close future.

Or at the very least, it should start including "type": "commonjs" today and support npm init --module ASAP for the ESM version. But these things are a bit out of scope. Many things about making ESM the default get easier with a supportive package manager, but I want to try to design this so that such buy-in isn’t required, so that we don’t need to depend on every popular package manager signing on to a particular new design.

mcollina commented 1 year ago

A new flag will not cause a break. Full stop. There is no breaking change proposed here.

What's the point of creating a mode of running Node.js that cannot run the majority of the registry?

I expect the .js to become recognized as an ESM by default some day, and this behavior as depicted here is unfeasible due to the massive ecosystem breakage.

The proposed behavior under the new flag will require an extra patch step until package managers catch up and insert the missing type field during package installation. This is not a big change; users can do it themselves via something like npm install && npx add-missing-type-fields. Calling this a Python 2/3 break is extreme hyperbole. I expect that package managers will begin doing it automatically once we ship the flag and they can test against this mode; or package managers will do it automatically when run by Node in this new mode, or at the very least offer an option to do so, saving users from needing to run the patch command. This is not a big deal. No package maintainers will need to update anything if they don’t want to.

I think this developer experience is unfeasible. What will happen when new packages that assumes are running in esm-first will be mixed with commonjs packages? Are users supposed to edit them by hand? This would worsen Node.js developer experience significantly and it is a very big deal, considering that the majority of the registry is (still) commonjs.

Moreover, this new mode will break quite a lot of npx and npm create commands, because individual packages would need to be fixed manually.

The ecosystem is already complex enough, and this new mode will actually worsen the situation. We can avoid completely avoid this friction and having "type": "module" there is by far a minor problem as long as it's the default.

Personally though I think it’s better to have the same behavior on both sides of the mode, and just have the flag flip the default: it’s simpler, easier to reason about, and easier for other tools to follow.

I agree. The default should not be changed actually: it has no friction whatsoever for developers. Let's not break everyone.

LiviaMedeiros commented 1 year ago

It seems like there's some confusions there.

Shebangs

IMHO most of linked prior discussions are a bit mislead.

  1. Shebang does support arguments on its own. It works just fine as #!/usr/bin/node --esm. EDIT: there is limitation of only one argument, so it can't be used together with any other option
  2. There is a practice of using env(1) in shebangs that basically works as #!$(which node)
  3. env historically has limitation of not being able to pass arguments, which is fixed in modern GNU coreutils and FreeBSD.
  4. Minimalistic distributions that use e.g. busybox may not have split-string feature in their env.

The solution for portable enough shebang without separate binary is to write #!/usr/bin/node --esm, simple as that.

I don't recall any issues with that; even though very same thing always applied to shebangs with --experimental-wasm-modules, --allow-fs-write, or any other option.

Separate binaries

There is a difference between separate binary that is compiler with different source code and/or different compile options, and a copy of the very same binary that has different name.

The first comes with downsides such as increased build time and increased release size.

The second (which is proposed in https://github.com/nodejs/node/pull/49407) comes with restriction that the second binary must have specific filename.

For common real life example of it, one may run find "$(git --exec-path)" -samefile "$(which git)". If this prints about 137 filepaths, you found it: all the git something commands are apparently separate files referring to same inode, AKA hardlinks. If it doesn't, ls -l "$(git --exec-path)" will probably show a waste of space with copies. Symlinks would also work, as long as we don't resolve them (e.g. process.execPath).

The implementation comes with almost no overhead, and node-esm doesn't even have to be included in release build; it's a job for the installer that places node in /usr/bin/.

Breaking change

No, adding new optional command line flag that is disabled by default does not break anything for anyone by itself. It's just a feature that will instantly benefit anyone who already prefers ESM-first code.

And yes, blindly running everything in this mode with Node.js ecosystem that we have right now is likely to break.

This is not only intentional, this is the intent. We must be able to enable it and see what happens. We must ship it so any developer can enable it and see how it works with their projects. We must ship it so any user can enable it and see what breaks in the packages they use. We must ship it so any maintainer can see which dependencies are breaking.

Only after that, we can decide if it's feasible or not to give everyone time to migrate, then introduce --cjs flag and maybe deprecate the default-cjs behaviour, then give everyone more time to migrate, then maybe change the default to ESM-first.

mcollina commented 1 year ago

This is not only intentional, this is the intent. We must be able to enable it and see what happens. We must ship it so any developer can enable it and see how it works with their projects. We must ship it so any user can enable it and see what breaks in the packages they use. We must ship it so any maintainer can see which dependencies are breaking.

Only after that, we can decide if it's feasible or not to give everyone time to migrate, then introduce --cjs flag and maybe deprecate the default-cjs behaviour, then give everyone more time to migrate, then maybe change the default to ESM-first.

This is where we are not in agreement, and any plans that include massive ecosystem breakage should be avoided. I'm strongly -1 on this proposal as currently framed. Moreover, I think any plan involving the discontinuation of commonjs should be avoided.

tniessen commented 1 year ago

Shebang does support arguments on its own. It works just fine as #!/usr/bin/node --esm.

To be precise, shebang on Linux supports only a single argument. Anything beyond that needs a modern env.

bmeck commented 1 year ago

One thing not mentioned here as well is tooling needing out of band info for flags/separate binaries like this; tsconfig/eslintrc/babelrc etc. would need to account for this flag since it wouldn't be statically analyzable and/or can be varied per entrypoint or in multiple modes for a given directory:

/bin/newer -> #!/usr/bin/env node-esm or node --default-operating-type=esm
/bin/older -> #!/usr/bin/env node

These need not be in shebangs either, these can be sources out of band just by invocation of the executable, package.json#scripts, etc. This tooling problem should probably be considered with them in this discussion as going completely out of sync of tools was a struggle for exports but that was at least statically analyzable.

LiviaMedeiros commented 1 year ago

This is where we are not in agreement, and any plans that include massive ecosystem breakage should be avoided. I'm strongly -1 on this proposal as currently framed. Moreover, I think any plan involving the discontinuation of commonjs should be avoided.

Any suggestions then?

aduh95 commented 1 year ago

@LiviaMedeiros you are demonstrating that passing a single argument works, it would be more interesting to show wether passing a second argument doesn't work.

This is where we are not in agreement, and any plans that include massive ecosystem breakage should be avoided. I'm strongly -1 on this proposal as currently framed. Moreover, I think any plan involving the discontinuation of commonjs should be avoided.

Any suggestions then?

I guess we'd need to find a way to either restrict the effect of this flag to dev-only (a bit like the Buffer constructor deprecation warning is emitted only outside of node_modules folder), or hide it behind a build flag.

LiviaMedeiros commented 1 year ago
$ cat b
#!/bin/echo a b c d e --amParam --amOtherParam
$ ./b amArg1 amArg2 amArg3
a b c d e --amParam --amOtherParam ./b amArg1 amArg2 amArg3
aduh95 commented 1 year ago

(We're off-topic here, but the above doesn't really show if it's taken as one argument, or several)

$ cat b
#!/usr/bin/node --no-warnings --abort-on-uncaught-exception
console.log(process.argv)
$ ./b test
/usr/bin/node: bad option: --no-warnings --abort-on-uncaught-exception
mcollina commented 1 year ago

Any suggestions then?

I think asking the package managers to add "type": "module" when initializing new projects achieve most of the same results. As I mentioned earlier, not changing the default "type" in package.json will avoid most if not all of the ecosystem breakage.

LiviaMedeiros commented 1 year ago

@tniessen @aduh95 my bad, shebang indeed interprets parameters single argument with spaces. Thanks for pointing this out!

I guess we'd need to find a way to either restrict the effect of this flag to dev-only (a bit like the Buffer constructor deprecation warning is emitted only outside of node_modules folder), or hide it behind a build flag.

Dev-only might work, I guess we can sacrifice performance for that on the first stage.

I think asking the package managers to add "type": "module" when initializing new projects achieve most of the same results.

I think this part is feasible on its own, e.g. marking existing packages without type as using fallback mechanism that should default to commonjs, and setting deadline after which the default fallback value changes to module and/or "type": "module" gets explicitly set by default.

What about files outside of scope? How can we launch non-symlink /usr/local/bin/something.js or /usr/local/bin/something as ESM?

GeoffreyBooth commented 1 year ago

As I mentioned earlier, not changing the default “type” in package.json will avoid most if not all of the ecosystem breakage.

@mcollina If I’m understanding your full comments correctly, this was the only part that you opposed, right?

I have conflicting thoughts on this myself. On the one hand, the more that the flag breaks, the less adoption it’ll get and the longer the timeline to potentially making it the default. On the other hand, like @LiviaMedeiros mentioned, the more that it breaks behind a flag, the more potential gain: if people try out the changes and like them, that builds momentum for the ecosystem adapting.

So I guess the question is, what’s the benefit of changing how “typeless package.jsons” are treated? The only one I can think of is symmetry between the on and off versions of this flag, which is another way of saying user expectations. I can’t think of any use cases enabled by flipping this default. So sure, we can preserve “no type = CommonJS” if we want, and it makes the flag a bit harder to explain but it avoids a set of other problems, the things I was saying would need a patching script for. And maybe that’s what we should do, I don’t know.

There’s no way this flag won’t land without --experimental- in its name. So I also think that we’re free to land the most ambitious version first, just to see how people react and use it, and I’m sure there will be changes before the flag leaves experimental status (long before it becomes a new Node default). We could even split out the “treat typeless package.json as CommonJS” part into a separate flag, so that the primary flag enables everything else and another experimental flag enables that particular behavior, to see if the typeless change has benefits of its own. I don’t think we should be putting down hard blocks against these types of experimentation.

What will happen when new packages that assumes are running in esm-first will be mixed with commonjs packages?

I was thinking that in the new mode, type would be required in package.json files when they’re present. So anyone publishing a package assuming ESM first would still be forced to add "type": "module", keeping it compatible with the legacy mode.

Though I acknowledge that this breaks the symmetry, so maybe there’s no value in flipping how typeless scopes are treated. It would be an unfortunate “this is always this way for legacy reasons” thing.

One thing not mentioned here as well is tooling needing out of band info for flags/separate binaries like this; tsconfig/eslintrc/babelrc etc. would need to account for this flag since it wouldn’t be statically analyzable

I don’t think this is much of an issue. Most tools that I’m familiar with have moved away from the .babelrc pattern into ones like vite.config.js where the file extension behaves like any other file in that scope. And again, we can ship an experimental flag and see which tools are broken under it, and see what those tools say when they open issues with us; would there be a path for them to get their tool working in the new mode, or not? This is part of the point of the experiment.

bmeck commented 1 year ago

@GeoffreyBooth they interpret the type of files. I'm not concerned about tsconfig.json being interpreted in a new way but about TS not being able to determine the type of a module without "yet another config option". Even with more config options the entrypoint of a program could alter the type of a file to superimpose the 2 modes of operation.

mcollina commented 1 year ago

@mcollina If I’m understanding your full comments correctly, this was the only part that you opposed, right?

Yes. There are a lot of technical details in the other points that needs to be ironed out, but nothing major.

The goal should be that the vast majority of modules should work as-is in this new mode. The ecosystem still downloads very old stuff all the time, with pinned dependencies.

I have conflicting thoughts on this myself. On the one hand, the more that the flag breaks, the less adoption it’ll get and the longer the timeline to potentially making it the default. On the other hand, like @LiviaMedeiros mentioned, the more that it breaks behind a flag, the more potential gain: if people try out the changes and like them, that builds momentum for the ecosystem adapting.

What is the gain exactly in this case? Changing the default type in package.json will generate a lot of pain. This is the part that I do not understand, why should we break people modules?

So I guess the question is, what’s the benefit of changing how “typeless package.jsons” are treated? The only one I can think of is symmetry between the on and off versions of this flag, which is another way of saying user expectations. I can’t think of any use cases enabled by flipping this default. So sure, we can preserve “no type = CommonJS” if we want, and it makes the flag a bit harder to explain but it avoids a set of other problems, the things I was saying would need a patching script for. And maybe that’s what we should do, I don’t know.

Essentially I'm prioritizing stability of the ecosystem. I don't think it makes the flag harder to explain, as doing that keep most packages running in the same way.

GeoffreyBooth commented 1 year ago

Changing the default type in package.json will generate a lot of pain.

I don’t think it would necessarily generate a lot of pain. Here’s a script that goes through and fixes missing "type" fields in all the package.json files in the current folder tree:

import { cwd } from 'node:process'
import { opendir, readFile, writeFile } from 'node:fs/promises'

let patched = 0

for await (const entry of await opendir(cwd(), { recursive: true })) {
  if (entry.isFile() && entry.name === 'package.json') {
    const packageJson = JSON.parse(await readFile(entry.path, 'utf8'))
    if (!('type' in packageJson)) {
      packageJson.type = 'commonjs'
      await writeFile(entry.path, JSON.stringify(packageJson, null, 2))
      console.log('Patched', entry.path)
      patched++
    }
  }
}

console.log(`Added "type" field to ${patched} package.json files that lacked it.`)

We could publish this to npm so that users could run it as an npx command like npx patch-typeless-packages and voilà, installing packages in the new ESM-first mode becomes npm install && npx patch-typeless-packages. A bit clunky? Yes. Python 2/3? No.

Alternatively I can see package managers taking the position that they disapprove of the tactic of patching packages after installation; that what users run should be what authors published, verbatim. And so no package manager builds in support for a script like the one I just wrote, even as an option. Hopefully they do something to ease the pain, though, like maybe adding an option to error on attempting to install type-less package. Now we’re at least in the same universe as Python 2/3, in that package authors would be pressured to update their packages to add this field, but it’s literally a one-line change that they need to make and it doesn’t require any coding knowledge or testing. So there would be some churn, but it’s not an ecosystem earthquake.

So basically we know how painful it will be on Day 1 when we release this experimental flag, but not how painful it would be on the day we want to make it Node’s new default; that depends on how the ecosystem reacts and adapts. And we can always revise this behavior while the flag is experimental.

This is the part that I do not understand, why should we break people modules?

I think a better way of putting it is, is the clunkiness/churn/pain worth changing how type-less package.json files are interpreted? @mcollina seems to think that the pain level is high and the benefit of the change minimal to nonexistent, which makes the decision easy. I’m more of the view that the pain level is low—not negligible, but low, and hopefully decreasing to negligible with ecosystem support—but the benefit is also low. If the pain can be eliminated, then even a low benefit becomes worthwhile.

I think the strongest argument goes something like this: Currently, all of Node’s server-side competitors support ESM out-of-the-box by default with no configuration required. (And TypeScript, for that matter, but that’s another story.) We’re losing the “new user” UX story because compared to Bun or Deno, it takes a few steps before users can just get going with writing code in Node that uses import and export. Having npm init add "type": "module" by default would help a lot, sure, but that’s out of our control and npm has rejected all such suggestions repeatedly. Maybe that would change if this new mode started making "type" a required field, adding some pressure on them, but who knows. But that’s ultimately the ideal: everything “just works,” both installing any package as well as writing ESM even in a typeless package. In other words, all gain and no pain. We can get there if we get some support from package managers, but if not then we’ll need to compromise.

muzuiget commented 1 year ago

I +1 for ESM by default, and want to see it in NodeJS 20 ASAP.

If there are breaking changes for exists code, why not keep using NodeJS 18 LTS?

Qard commented 1 year ago

I don't see any reason why we would need to change the type-less behaviour from the current commonjs when we can just get npm init and equivalents to include "type": "module" by default. If the json property is being set by default there's really no reason for us to risk breaking old modules to change the type-less default. Just go forward with getting package managers to always include the type in the future and assume it's old and therefore commonjs if not present.

As for running without a package.json, given the shebang problem I think it would be simplest and cleanest to just require the mjs extension, which already works today.

GeoffreyBooth commented 1 year ago

when we can just get npm init and equivalents to include "type": "module" by default.

We can’t, that’s part of the problem. It was proposed and rejected in 2019, then proposed and rejected again in 2021, then proposed again in 2022 (still open) and again in 2023. From what I can tell most voices seem to be in favor, though there’s one strongly opposed commenter who is extremely prolific in campaigning against the change. Though npm's track record on this relatively minor issue leaves me pessimistic in terms of how supportive a partner they will be in helping us migrate to ESM by default.

Just go forward with getting package managers to always include the type in the future and assume it’s old and therefore commonjs if not present.

It might be worth asking the npm registry folks if they might start requiring the type field on publish into the registry. I assume a package.json file itself is already required? Then at least going forward past the “turn on” date for requiring type, any new packages published won’t be presenting this issue.

As for running without a package.json, given the shebang problem I think it would be simplest and cleanest to just require the mjs extension, which already works today.

The request isn’t to do something that’s possible today, the request is to be able to run extensionless shebang scripts as ESM. This has been an ongoing and persistent user request over many years, and I don’t think we can just wave it away with the current workarounds like symlinks that can kinda-sorta-but-not-quite satisfy the user request.

Qard commented 1 year ago

Seemed to me like @ljharb multiple times stated that inclusion of the type field was a good idea. Just that it should be "commonjs" until ESM and the tooling to support it is properly ready. I would not interpret any of that as blocking adding the type field. Seems to me like it's just work no one has stepped up to do, and some have been unreasonable about expecting to make ESM the default there before the ecosystem was ready for that. We can start by always adding "type": "commonjs". Once all the core ESM tooling like loaders are stable we can start asking the user if they want to change from that default to "module", along with some explanation of the implications of doing so. Eventually, when the ecosystem support is there, we can think about switching that to the default.

If we want to be able to run a shebang file itself as ESM, we could just have an alias entrypoint like node-esm which runs node --input-type=module.

ljharb commented 1 year ago

Encouraging people to patch third party code is a nightmare waiting to happen; my project is only supported when untouched, but if node encouraged any package patching - even just the type field - maintainers would rapidly burn out from all the confusing support requests.

ljharb commented 1 year ago

I think there's a much bigger potential for breakage with pursuing a default "type" change - however, is there a reason not to pursue specifically a setting for extensionless files? That would allow it to support wasm, CJS, ESM, JSON, etc.

For example - a package.json flag, and/or a node flag, that asks "what extension should we pretend extensionless files have?" - and defaults to "cjs", for back compat. Thoughts?

GeoffreyBooth commented 1 year ago

I spun off the question of how to handle package.json files that lack a type field into a new issue: https://github.com/nodejs/node/issues/49494. I updated the top post to point to that issue instead of suggesting a particular behavior for that condition. If people don’t mind, let’s please move the discussion of that question over there.

So aside from that question and the other larger one (https://github.com/nodejs/node/issues/49431), are there any other open questions we need to address? Am I missing any other behaviors that this new mode should enable? Are there any other use cases we’d like this new mode to cover?

ljharb commented 1 year ago

Since the motivating use case seems to be “i want an extensionless file to have ESM”, what about an issue for a very focused solution for that use case, which i suggested in https://github.com/nodejs/node/issues/49432#issuecomment-1705608406 ?

GeoffreyBooth commented 1 year ago

The discussions in the various issues seems to have quieted down, and it seems like we might have a consensus. @LiviaMedeiros since you seem eager to make this happen, do you mind putting together a PR? It would be a combination of several of the recent PRs that you posted. I updated the top post to reflect my guess at the consensuses from the various threads. If you could put those together in a single PR, behind the flag with whatever name gets chosen in https://github.com/nodejs/node/issues/49541, then I think we might have something that can land.

ljharb commented 1 year ago

@GeoffreyBooth can you please respond to https://github.com/nodejs/node/issues/49432#issuecomment-1705796605? i don't believe there is indeed consensus yet.

LiviaMedeiros commented 1 year ago

I don't see any overall consensus here, either. In particular, on these two points:

  1. Flag or separate binary. IMHO both, because each has pros and cons. But maybe just one is enough.

  2. Option to provide main entry point as URL. Not discussed at all, although it might be a very controversial feature because it would need to mess with argv[1].

GeoffreyBooth commented 1 year ago

on these two points:

They were part of the original proposal and no one objected, so I'm assuming that everyone is okay with these aspects.

The separate binary we already know is controversial from your other PR, so I would leave that part out. We can start with the flag and then revisit the binary option.

mcollina commented 1 year ago

The second binary seems unfeasible as it would double our install and download side and almost double the work for our releasers. But given it's not me to have to do that work, I leave that objection to the releasers.

richardlau commented 1 year ago

The second binary seems unfeasible as it would double our install and download side and almost double the work for our releasers. But given it's not me to have to do that work, I leave that objection to the releasers.

Ok FTR, as one of the @nodejs/releasers I object for the same reasons as @mcollina.

Qard commented 1 year ago

The second binary could just be a script that calls the original one with all the same input args and additionally the input-type arg. That wouldn't be twice the size at all.

LiviaMedeiros commented 1 year ago
  • The CLI will parse the main entry point as an URL string, similar to how the value of --import is parsed.

I still object this as absolutely not feasible in combination with other parts of this proposal.

Please provide at least an example of how can it work.

GeoffreyBooth commented 1 year ago

this as absolutely not feasible in combination with other parts of this proposal.

I don't understand the concern. In #49295 you implemented this along with other ESM-first behaviors. Wouldn't we do it just like that?

LiviaMedeiros commented 1 year ago

The command line option proposed in https://github.com/nodejs/node/pull/49295 has different form (--module <URL>) which can't work with shebang by design. How is this solved here?

GeoffreyBooth commented 1 year ago

How is this solved here?

This proposal doesn't directly solve the shebang case but it gets us close. Once what is proposed is built, we have two options:

The second option is better but possibly unachievable. The first option we know we can do and it's better than the status quo, in my opinion.

LiviaMedeiros commented 1 year ago
  • I want to write what are typically called shell scripts in ESM JavaScript, where I can have a single extensionless file anywhere on my disk that uses ESM syntax and it doesn’t need a nearby package.json file or a symlink or wrapper file in order to run as ESM.
  • I want consistency in how the CLI handles references to files; currently --import accepts URL strings but the main entry point must be a path string (and therefore can’t be a data: URL or https: URL, such as to work with --experimental-network-imports).

Please provide an example of how these two would work in this mode. Probably an extensionless ESM file that can be launched with node binary (separate or with flag, anything goes), launched directly (assuming we run it on e.g. Ubuntu with latest coreutils), and launched over https (assuming we configured web server to provide all appropriate headers). For simplicity we can assume that it's a stray file: no package.jsons around, no node_modules/, and no subsequent imports.

I'm mainly interested in import.meta.url (i.e. have some ?a=b#c in URL) and process.argv (i.e. have firstArg secondArg, too) values in each case, so it would be nice to demonstrate them whenever applicable.

GeoffreyBooth commented 1 year ago

I’m mainly interested in import.meta.url (i.e. have some ?a=b#c in URL) and process.argv (i.e. have firstArg secondArg, too) values in each case, so it would be nice to demonstrate them whenever applicable.

What would be the desired UX be for query parameters paired with an extensionless file? Like if you had /usr/local/bin/extless, where extless is a JavaScript file with a Node shebang, I assume you’re running this via just extless; if you try to run it like extless?a=b#c the shell is going to complain because it can’t find a file with that name. If you want to pass parameters to an extensionless script, I think you need to use regular CLI arguments like extless --a=b c. Right?

Please provide an example of how these two would work in this mode. Probably an extensionless ESM file that can be launched with node binary (separate or with flag, anything goes), launched directly (assuming we run it on e.g. Ubuntu with latest coreutils), and launched over https (assuming we configured web server to provide all appropriate headers).

I’m not sure I’m following what you’re asking for. You want to run an extensionless file . . . that’s served over HTTPS? Do you mean you want to run a command that begins with node, like node https://a.tld/script? I assume you’re not trying to run just https://a.tld/script as the entire command and you’re expecting your shell to download and evaluate that?

I created a file named extless and ran chmod +x on it and gave it these contents:

#!/usr/bin/env -S node --enable-source-maps --no-experimental-fetch
console.log(typeof fetch)

Then I ran docker run --volume "$PWD":/app --workdir /app --interactive --tty --entrypoint bash node:20 to open a shell within the Node 20 Docker container. Running ./extless prints undefined. If I remove the --no-experimental-fetch, it prints function.

So at least in distros with modern env, that allow env -S, we can pass flags via shebang. Yes there’s still the problem of the distros lacking this support, and I’m not aware of a solution for them other than a new binary; but I wouldn’t want to deny this flag to distros that can take advantage of it, just because there are other platforms that can’t.

If I change the second line to console.log(process.argv) and run it as ./extless foo bar, it prints:

[ '/usr/local/bin/node', '/app/extless', 'foo', 'bar' ]

So this is an extensionless file that can be launched with the Node binary and launched directly, which answers the first two parts of your request. For the third, “launched over https (assuming we configured web server to provide all appropriate headers)” I assume you mean running via something like this?

node --experimental-type=module --experimental-network-imports https://example.com/extless

In this new mode, yes, I would expect this to work. It would ignore the shebang line in the file, since node is the entry point rather than bash or some other shell, and so you’d have to include whatever Node flags you need as part of the command.

It would be simpler test with data: URLs:

node --experimental-type=module 'data:text/javascript,console.log("hi!")'

I would expect this to work, like how this command works today:

node --import 'data:text/javascript,console.log("hi!")' --input-type=module --eval ''

The import.meta.url of a data: URL is just the entire data: URL. I would assume that for an https: URL it’s the https: URL, but that’s a question for --experimental-network-imports.

Does this answer your questions?

LiviaMedeiros commented 1 year ago

No, for starters I need to see both import.meta.url and process.argv for all three cases. The code should be at least

#!<shebang in any form that would enable this mode>
console.log(import.meta.url, process.argv);

Examples that don't have searchstring, hashstring and CLI arguments don't demonstrate how it works. Examples that don't print or assert the target values don't demonstrate it as well.

GeoffreyBooth commented 1 year ago

I need to see both import.meta.url and process.argv for all three cases.

It's not a goal of this PR to have specific values for those. Why not implement it and see what they are? I don't know offhand what they should be, but they don't make much difference to me. The use cases at the top are satisfied no matter what these values are.

Do you have specific needs for particular values? Are there use cases satisfied by some values and not others?