Closed guybedford closed 5 years ago
I have a writeup from conversations in form of a pseudo design document.
The mode effectively marks a directory as an "esm mode" directory, where all ".js" files
What about extensionless bin files that CLI's often use?
Update
There's a mention in the writeup about --format
related to extensionless but I'm not sure why a mode
property couldn't cover it.
The writeup mentions a gotcha with require
and extensionless files.
Does this mean require
can load ESM?
Thanks a lot for your work that might save us from a new javascript extension.
I know this is an entirely minor detail to consider right now, but I would only argue using shorthand, esm
, instead of something more descriptive like esModule
, ecmaModule
or maybe something else but just more apparent.
@jdalton --format
is for entry points that don't have extensions at all / STDIN, it does not alter the default format of file extensions. If --mode
affected those situations it would need to expand scope so that all extension-less files were treated in a specific way. This complication is why it was punted.
@bmeck
This complication is why it was punted.
What about tackling it in the shebang of the bin file like
#!/usr/bin/env node --mode esm
@jdalton punted and as describe in the design doc "bin" files can have extensions even if the command does not which is the common case for using command line wrappers, feel free to PR the --format
flag as described if you feel strongly about needing that specific use case.
punted and as describe in the design doc "bin" files can have extensions even if the
The bin file is opting in to esm with the --mode
in the shebang. It's not executing the command. Node would parse it out. How you handle conflicting signals is a separate issue.
@jdalton --mode
is setting up file extension mapping similar to a MIME DB, --format
is a very different idea which overrides something that doesn't have a well known intention already. We absolutely should not conflate these flags in my opinion.
We absolutely should not conflate these flags in my opinion.
Having it specified in the shebang is a nice descriptive and natural way to tackle extensionless bins.
@jdalton PR up the --format
flag :)
Naw. It doesn't have to be --format
or bust.
@jdalton One problem with putting arguments into the shebang is that it's not super portable. :( https://unix.stackexchange.com/questions/63979/shebang-line-with-usr-bin-env-command-argument-fails-on-linux
P.S.: Shipping something like a nodem
or mnode
shim with node might fix that though.
Or we could try it in this PR, write tests, and see if issues are had.
Or we could try it in a PR, write tests, and see if issues are had.
Not sure I follow - this isn't hard to verify. The following:
$ cat test.js
#!/usr/bin/env node --harmony
console.log(process.argv, process.execArgv);
Works on OSX but won't run on (all) Linux (the same without --harmony
works on both):
./test.js
/usr/bin/env: node --harmony: No such file or directory
Not sure I follow - this isn't hard to verify. The following
Many times ideas are shot down with a quick SO post without actually seeing if, in the given scenario, it is doable or an issue is able to be worked around.
Pinging @ceejbot and @chrisdickinson who had this in their npm proposal, though it was unimplemented (wondering if they ran into similar issues or if they've thought of anything to workaround the gotcha).
P.S.: Shipping something like a nodem or mnode shim with node might fix that though.
That may be it :yum:. I never considered something like that. Kinda neat!
Question: with such a proposal, how would I write a "dual mode package", one which works whether we require
the package or whether we import
it. This is possible in the mjs proposal, but I don't see how you can do it in this proposal.
If we go with this proposal, all library developers that want to migrate to the esm model will need to fork their modules. So we will have moment
and moment-esm
. lodash
and lodash-esm
, uuid
and uuid-esm
, etc...
Given that we're going to live in dual-mode world for a long time, I believe this is not viable.
@giltayar You would still use the .mjs
approach for such a behavior. This PR is ontop of .mjs
and seeks to alleviate common cases that do not need such a behavior since it is already available through .mjs
. A variety of situations do not need to support dual mode packages, such as top level applications that are never meant to be treated as dependencies.
Another question: if I am doing an import "./foo/bar/gar/qux.js"
, would Node need to check whethe there is a package.json
in ./foo/bar/gar
and then ./foo/bar
and then ./foo
and .
? That feels like a performance hit. If it doesn't, how does it figure out what the "package boundary" of the file is?
@giltayar yes it would stat
every directory for a package.json
once per process, and then it can cache it for later. It will need to do that also for other features like per package loader hooks. If startup performance is still a concern I recommend using a bundler, even for Node. This "mode" field is static, cached, and not determined by any runtime API so it should not be difficult to figure out if bundlers want to support it. It is not in a first iteration, but the design document does have a link to the fact that this "mode"
might have more complex file format models as well in the future for things like Flow or JSX. Those designs have been discussed but are not in this iteration until we get more feedback on "mode"
.
@bmeck: OK, so "package boundary" is a new concept in Node. Didn't exist until now, right? (this is not criticism, it's more trying to understand).
And it's the first time, IIRC, that package.json
contains something that influences the running of files "underneath" it. And this concept of "package boundary" will be used in the near future by the loader mechanism for things like transpiling flow/json/ts. Interesting.
@giltayar
OK, so "package boundary" is a new concept in Node. Didn't exist until now, right? (this is not criticism, it's more trying to understand).
Correct, it has existed in userland to some extent under differing definitions/mechanisms, but this is the first time something of its nature has been proposed to come to Node's core.
And it's the first time, IIRC, that package.json contains something that influences the running of files "underneath" it. And this concept of "package boundary" will be used in the near future by the loader mechanism for things like transpiling flow/json/ts. Interesting.
We are still hashing out those workflows details, but thats the plan. The recommendation will remain not to do heavy workflows like completely transpiling on every startup though. Loaders will want to design a cache mechanism across processes, and/or manage to ensure they can be run ahead of time if they want to avoid startup time problems.
@MylesBorins the "mode" effectively sets how a module format is determined - using the extensionFormatEsm extension checks for "esm" mode and extensionFormatStd extension checks for the backwards compatible mode or "commonjs" or "legacy" or "interop" or whatever we want to call that.
The distinctions can evolve over time but the basic changes in "esm" mode are (1) ".js" files within that same mode boundary are treated as ES modules and (2) loading files without file extensions is not permitted.
@guybedford that eventually hit me... commented as such above.
from above
I have mixed feelings about how this enables importing a .js to have different semantic meaning based on meta data in the folder. It seems ambiguous in a way that isn't statically analyzable. Does this mean that if you directly require a file by full path name we need to check the system for a package.json and parse that package.json for the mode every time?
One one hand I really like that it is configurable. We can make a default, and that default can change as the ecosystem transitions between module systems. On the other hand it feels like there may be too much magic and it is hiding the differences in a way that introduces a lot of complexity into the module resolution algorithm (it would to check and crawl up all folders until finding a package.json to know mode for this algorithm afaict).
I think I may prefer the pattern of having mode be global, more similar to the npm proposal, and forcing the use of import.meta.require
for cjs modules when in the esm mode. The keeps the use of import consistent and would allow us to eventually moonlight the complexity of transparent interop at some point by changing defaults while still giving application developers a legacy mode if we don't remove the cjs mode.
With all that being said I'm not 100% how I feel about this just yet and will think on this quite a bit more and discuss with folks before landing on specific semantics. I'm not going to block anything, but I've marked this in-progress
and would like chat more about the long term implications of this implementation before landing.
Thanks for all the hard work on this, it is really stellar stuff!
@MylesBorins
One one hand I really like that it is configurable. We can make a default, and that default can change as the ecosystem transitions between module systems. On the other hand it feels like there may be too much magic and it is hiding the differences in a way that introduces a lot of complexity into the module resolution algorithm (it would to check and crawl up all folders until finding a package.json to know mode for this algorithm afaict).
It crawls up until it finds a package.json file. Usually it shouldn't be incredibly deep, but it might be deep. This behavior is also required for things like per-package hooks into the loader so perhaps isolating all of this magic into one place is a good thing?
I think I may prefer the pattern of having mode be global, more similar to the npm proposal, and forcing the use of import.meta.require for cjs modules when in the esm mode. The keeps the use of import consistent and would allow us to eventually moonlight the complexity of transparent interop at some point by changing defaults while still giving application developers a legacy mode if we don't remove the cjs mode.
Even without transparent interop and with import.meta.require
we still need a flag if we want to disambiguate .js
files, like the one in this proposal. We cannot let consumer be the definer of file format if we want any static analyzability without ambiguity.
Ping @guybedford
@BridgeAR I believe his is on vacation right now, what is up?
Would it help for review and consideration if I removed the top-level --mode
flag here for now? That would bring the diff down by about 150-200 lines I believe.
The main reason I bundled this in is because one of the major complaints with the previous proposal was for the scenario of node x.js
where there is no package.json resulting in unnecessary statting, so the flag provides the mechanism for avoiding that, while also aligning with the npm proposal quite nicely. But it could be deferred to a separate PR if that would simplify.
@guybedford +1 since we can do that separate PR and it seems to be a bigger talking point on this one.
@guybedford If it's not too difficult to split, +1.
Ok I've split out the features here and updated the description above (https://github.com/nodejs/node/pull/18392#issue-291868841) to reflect that this is just the package.json "mode": "esm"
feature now. The separate --mode
flag is at https://github.com/guybedford/node/tree/package-top-mode.
I hope that helps to review, and let me know if there's anything more I can do to help make this easier to consider.
Thanks! I'm going to test it on a project and review.
It looks like this is breaking CJS interop.
Given:
test.mjs
:
import value from 'cjs';
console.log(value);
node_modules/cjs/package.json
:
{
"main": "src/index.js"
}
node_modules/cjs/src/index.js
module.exports = 42;
Current master branch prints 42. This PR throws:
Error [ERR_MODULE_RESOLUTION_LEGACY]: cjs not found by import in file:///test.mjs. Legacy behavior in require() would have found it at /node_modules/cjs/src/index.js
at search (internal/loader/DefaultResolve.js:33:15)
at Loader.resolve [as _resolve] (internal/loader/DefaultResolve.js:71:7)
at Loader.resolve (internal/loader/Loader.js:49:18)
at Loader.getModuleJob (internal/loader/Loader.js:81:40)
at ModuleWrap.promises.module.link (internal/loader/ModuleJob.js:37:40)
at link (internal/loader/ModuleJob.js:36:36)
at <anonymous>
@targos great catch, I've pushed the fix now with a test case. This was a typo missing out the boolean argument copying from https://github.com/nodejs/node/blob/master/src/module_wrap.cc#L575.
After little testing, I think this is doing what I would expect!
@guybedford Could you take a look at https://github.com/image-js/tiff/commit/834c0adc6c20436d4b2a93971f779e59bae0cea3 ? Would it be the right way to update a library to support both ESM and CJS if this PR lands? If so, +1. I want that :)
@targos that would work if you only provide a "main"
entry point to your module. Yes.
@targos glad it seems sensible to you. Nice dual-target approach - yes that works well!
@MylesBorins I definitely agree there can be a deprecation workflow here, although I think it would be best-suited to the package manager to manage this:
"mode": "cjs"
automatically inserted on install if they don't have "mode": "esm"
init
it can be standard to include "mode": "esm"
this is a crucial part of making this workflow easy.I'm adding these features to my own package manager work, and the hope is others could do the same. Then we can start to coordinate that deprecation path definitely.
@targos much appreciated for review, and catching the inconsistencies in the test names. If we can expand that query to confirm the "mode" property is safe, would be keen to add that validation.
@guybedford I guess the question for me is "Does Node intend to eventually change the default"? If the answer is yes, which I hope it is, I think the onus is on us to warn that this behavior is going to change.
@MylesBorins certainly Node could have a warning, but if we added that today we'd get ~100~ 10s of warnings for every single npm package dependency loaded from ESM, with no easy action from the end-user apart from sending a PR. If the package manager could take the first step adding on install that will avoid this painful middle ground, but otherwise I suppose we can go for this bulldozer warning approach :P
@guybedford I'm imagining this would only be a warning for the top level package. Seem reasonable?
@MylesBorins or we could just state that there was at least 1 warning and tell them to use something like NODE_DEBUG
if they want to see all of them.
@bmeck not sure I'm following. Could you rephrase?
@MylesBorins I imagine we could have:
$ node my-app
Warning: Encountered a package without a "mode" field in its package.json
use NODE_DEBUG=loader to get more details.
And only print the warning once per process, and when the flag is set:
$ NODE_DEBUG=loader node my-app
LOADER 12345: Encountered a package without a "mode" field in its package.json at file:///app/package.json
LOADER 12345: Encountered a package without a "mode" field in its package.json at file:///app/node_modules/foo/package.json
LOADER 12345: Encountered a package without a "mode" field in its package.json at file:///app/node_modules/bar/package.json
@bmeck so I was imagining it being a bit more descriptive
$ node my-app
Warning: Encountered a package without a "mode" field in its package.json. This currently defaults to the `cjs` mode but will default to the `esm` mode in the future
That copy is bad, and needs work... but that is what I wanted to be explicit about.
With all this being said I'm still having mixed feelings about the mode being on a per package level for external imports. I don't think that I have totally wrapped my head around all of the edge cases though, so I'm going to keep most of that opinion to myself until I've worked it out
Providing only the first warning could definitely make sense. I'd still be averse to implementing this if that warning could apply to a third party package without some type of greater upgrade path strategy (ideally with package manager buy-in though). Let me know if you disagree, but for now I'd say it falls out of the scope of this PR (until tangible upgrade path plans can be put in place, but they will definitely be possible).
I'm fine punting the warning mechanics to afterwards.
I'd personally like the single warning at a top level that defaults are subject to change in this PR. This is closer to a -0 than a -1, but I want to set expectations early
What is a package consumer supposed to do when they encounter this warning? What does the warning mean, in practical terms? What is it warning against?
What does the package author achieve by marking a directory as containing only module-parse-goal files or common-js-requireable files? What if the package author has, say, a bin dir with files intended to be used by older and newer nodes in a hybrid package? How would they silence the noise? Is an override per-file possible for hybrids?
I realize I'm new to this PR, but is there a design doc I could read outlining the overall vision & use cases we'd like to support? That would help me aim feedback usefully.
@ceejbot thanks so much for taking a look, as mentioned in previous comments this is something that works much better together with the package manager, so it would be really great to hear your feedback further. The original proposal for this is located at https://github.com/nodejs/node-eps/pull/60/files. This mode does not affect the way CommonJS loading works and won't stop require
from loading .js
files marked as esm, it is only for the es module resolver.
The warnings as discussed here would simply be to ensure the mode is explicit - "mode": "cjs"
is currently equivalent to not having a mode or even having an invalid mode property. Currently there are no warnings and I'm resistant to integrate any warnings until we have actionable steps for them. Directories marked as module files only affect the ES module resolver, CommonJS require doesn't change behaviour at all with this PR so dual packages work fine like in https://github.com/nodejs/node/pull/18392#issuecomment-363122412. In addition a package.json in a subdirectory can provide directory-level overrides.
The warnings are about getting to a place where the default mode can eventually (mid to long term) be switched to treating js files as ES modules. To prepare for this Node could warn for packages not containing a mode, with for example, "mode": "esm"
added during npm init workflows with a prompt of "Use ES Modules for js files?" or similar and "mode": "cjs"
automatically written into the package.json for installed packages when no mode is present. This is where package manager support comes in as I say. That might then allow us to eventually change the default as Myles suggests.
@MylesBorins I guess the top-level main is usually guaranteed to be a user package except in the case of running bin scripts, where the warning would then be completely unwanted. If there was a way we could at least distinguish those running cases it might be possible to add something sooner, but until all actionable steps are fully worked out I'd say we should wait.
This provides a
mode: "esm"
flag for package.json files which will treat ".js" files as ES modules within a given package boundary.The package boundary of a package.json file is deemed to be all modules in that folder and subfolder until the next nested package.json file.
The plan is to add a top-level
--mode
flag to this, roughly matching the npm proposal, but restricted to the current package boundary. This work is currently at https://github.com/guybedford/node/tree/package-top-mode.The mode effectively marks a directory as an "esm mode" directory, where all ".js" files in that directory are loaded as ES modules, unless they are located in a subfolder that contains a package.json without an "esm mode", which results in reverting to the transparent interop of treating ".js" as CommonJS.
In addition this PR enables caching of package.json lookups within the ES module resolver.
Presentation: https://docs.google.com/presentation/d/1xK1ana_TIxfAHX33CYVHFnJsV0If_YirLtRBBga9cv0/edit#slide=id.p
This work is based on the following prior proposal work:
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)
esmodules