Open mcollina opened 6 months ago
cc @GeoffreyBooth @joyeecheung
It’s certainly much reduced since the initial commit where that warning was added; probably next to nothing for CommonJS files, as that’s what we’ve optimized for. I think it’s probably still a noticeable impact for ESM files that are ambiguous (like a .js
file using import
/export
where the package.json
has no type
field). Did you measure that case?
yes... 1ms on my machine.
Well that’s great news, I wasn’t expecting that. 1ms for what, a single file? Or an entire app, like if you had a thousand ESM files in a "type": "module"
scope and you removed the type
field?
I guess the question then is what are the downsides of not trying to discourage detection? If I remember correctly, the primary intent with detection was to cover two cases:
/usr/local/bin/my-node-script
where the user can’t reasonably put a package.json
file nearbyWith the thinking being that the performance hit of detection was outweighed in these cases by the UX improvement, of having the my-node-script
command work or having the beginner’s code work without first needing to learn about .mjs
or type
. But that aside from these niche cases, most apps could and should specify the type
field to avoid the performance penalty.
But if there is no performance hit, or a negligible one, then why wouldn’t everyone just rely on detection all the time? Why bother setting the type
field? I feel like there must be reasons that people would want to still do so, like compatibility with build or linting tools maybe; but I don’t know concrete examples. Maybe there aren’t any, or not any compelling ones, and we shouldn’t bother trying to encourage the type
field or .mjs
/.cjs
? It feels wrong even to suggest that, but off the top of my head I can’t think of the argument why not.
I would not push our luck too much, as 1ms for 1000 modules is 1s, and I've seen apps with that many.
I would likely not emit the warning for the top level/entrypoint, but still for everything inside node_modules
.
I would likely not emit the warning for the top level/entrypoint, but still for everything inside
node_modules
.
The problem there is that warning for node_modules
is really annoying, as the code in there is generally outside of the user’s control. Like what are they supposed to do to address the warning; patch the dependency? Ask its maintainer to add the type field? Choose something else?
We discussed warning on node_modules
early on and we were thinking that it wouldn’t matter so much for there because library authors would hear from users really fast if they published libraries that only work on Node versions with detection enabled by default. Eventually once detection has been enabled by default for years and the older versions of Node are EOL, that might be more of a reason to warn; though who knows what assumptions we should make by then. Maybe by 2027 we flip to attempting to parse as ESM first and fall back to CommonJS, if the ESM migration is finally over the hump by then, and so the only detection performance penalty is for CommonJS files relying on it.
Then not warning at all is actually ok.
I have run a benchmark on one of my "huge" project using ESM on Windows. Here are the results.
Node 20 with "type": "module"
Benchmark 1: node bin/server.js
Time (mean ± σ): 728.4 ms ± 10.6 ms [User: 87.5 ms, System: 42.2 ms]
Range (min … max): 713.6 ms … 744.4 ms 10 runs
Node 22 with "type": "module"
Benchmark 1: node bin/server.js
Time (mean ± σ): 739.5 ms ± 35.4 ms [User: 78.1 ms, System: 34.4 ms]
Range (min … max): 644.4 ms … 767.0 ms 10 runs
Node 22 with --experimental-detect-module
Benchmark 1: node --experimental-detect-module .\bin\server.js
Time (mean ± σ): 744.8 ms ± 16.2 ms [User: 118.8 ms, System: 37.5 ms]
Range (min … max): 728.8 ms … 785.5 ms 10 runs
I have run a benchmark on one of my “huge” project using ESM on Windows
How many ESM files is this, and how many lines of code?
I also don’t quite know what to make of these results. The Node 20 numbers are interesting but not really relevant; I’m comparing Node 22 with and without the flag. If I wanted to boil it down to “here’s how much slower the flag makes the app,” am I comparing “Time (mean)” so 739.5 ms to 744.8 ms, or 0.7% slower? Or is it more relevant to compare the “User” numbers, 78.1 to 118.8, so 34% slower?
Or is what’s really happening here that the extra CPU cost of detection is irrelevant because the async file I/O takes so much longer? As in, without the flag the CPU is more idle than with the flag, but in the end the process takes almost exactly the same amount of wallclock time because the bottleneck is the I/O and not the CPU?
The warning is defending against the case where there is a package.json, but it doesn’t contain “type”: “module” even though the .js modules inside are using ESM syntax. To hit the case that it is defending against, you will need to recursively remove all the “type” : “modules” entry in your package.jsons (assuming you have any, and that most of your dependencies use real ESM and count on this to work), and run it with the flag; then compare it with the case where this detection flag is unnecessary (when you still have the type field everywhere). you can’t compare it with a case where you still have package.json with type fields because without the flag that module graph would not load at all currently.
How many ESM files is this, and how many lines of code?
I am not sure how I can count the loaded file since I am basically starting one of my application and killing it once the http server start (so it is the initialization time of the application).
I could re-run it many times to see if the number differ and if the "User" number seems related to the removal of the type in the package.json
.
you will need to recursively remove all the “type” : “modules” entry in your package.jsons
Do you mean I should also remove it from all my dependencies to create a fair test?
@RomainLanz I just mean count the number of JavaScript files and lines of code in your project. So delete/move away node_modules
and then run commands to count the number of .js
files, and sum the lines of code. (Not sure what those commands would be for Windows but I’m sure you can Google it.)
The benchmarks that we want to compare are:
package.json
edited to remove the type
field, if it was there; and with all your ESM files using the .js
extension, if you had any with .mjs
Assuming you probably have only one package.json
for your project, that’s all you need to do. I wouldn’t modify anything in your node_modules
, as that’s not realistic that typical users would do. If you’re using Hyperfine, please include the summary lines at the end (“such and such is 2x faster than such and such”).
Do you mean I should also remove it from all my dependencies to create a fair test?
I would say yes because removing the warning effectively encourages packages to publish without adding a type field in their package.json and that’s what we don’t want. The warning here would show up in their CI to prevent packages from accidentally publish without the type field. Otherwise it could very well be interpreted as “it’s fine to publish ESM in .js without a type field now!” And the ecosystem start to get slower as it becomes the norm and Node.js commonly has to second guess the module type.
Usually, the community is pretty good at correcting those things, so I don't expect it to be a problem.
I don't think the ecosystem is going to start relying on automatic detection. This will break TypeScript, ESLint, etc.
If we are confident that the ecosystem will add the type field, then those who do won’t be hitting the warning anyway. Either enough people will rely on it so we should prevent it from being a norm, or not enough people will rely on it then only very few people get to be bothered by this warning and it’s harmless to keep.
Either enough people will rely on it so we should prevent it from being a norm, or not enough people will rely on it then only very few people get to be bothered by this warning and it’s harmless to keep.
Well, my main concern is developers who do npm init
for a new app, and I would prefer they not see a warning for a few millis of delay. The DX benefits are not worth the warning.
And not to sound like a heretic on performance, but do we generally warn on performance concerns? Like there are all sorts of slow patterns that we could warn users about—Warning: This .forEach would be faster as for ... of
—but I feel like in general we leave it as the user’s responsibility to write performant code. We try to nudge them in the performant direction, for sure, by making the performant approach the easiest or default option when possible; but if something works, and isn’t unsafe, is that worth warning about?
(Aside from experimental warnings that serve the purpose of warning about potential breaking changes; we definitely still want those warnings.)
I just mean count the number of JavaScript files and lines of code in your project. So delete/move away node_modules
That would be highly complicated to count. As said, it is a real world project that depends a lot on third party libraries (coming from the node_modules
). It is an AdonisJS application with many addons/providers that is "slow" to start in either way (with or without the type: "module"
in the package.json). I can provide some examples if you want to run your own benchmark.
I shared the numbers about to show that there are no real difference with and without the flag. Meaning I believe we can remove the warning.
Well, my main concern is developers who do
npm init
for a new app, and I would prefer they not see a warning for a few millis of delay. The DX benefits are not worth the warning.
Will the warning stay when the feature become unflagged?
That would be highly complicated to count. As said, it is a real world project that depends a lot on third party libraries (coming from the
node_modules
).
If you can run Bash, you can count them via:
find . -type f -name "*.js" -not -path "*/node_modules/*" | wc -l
And count lines of code via:
find . -type f -name "*.js" -not -path "*/node_modules/*" | xargs wc -l
I’m excluding node_modules
because I don’t expect users to be using detection for their dependencies.
Will the warning stay when the feature become unflagged?
The warning about performance would presumably stay, yes, because it doesn’t relate to the feature’s experimental status.
Well, my main concern is developers who do npm init for a new app, and I would prefer they not see a warning for a few millis of delay. The DX benefits are not worth the warning.
Shouldn't the proper solution to that problem be including "type" in npm init
?
Shouldn't the proper solution to that problem be including "type" in npm init?
https://github.com/nodejs/TSC/issues/1445#issuecomment-1742118965
npm is considering adding the type field to npm init: https://github.com/npm/init-package-json/pull/302
(Although it seems the general stance from npm is - people shouldn't really be using npm init
for anything other than basic apps - for real apps they should use the create-
packages, which should configure the type field properly )
I did some local benchmarking out of curiosity - currently loading a ESM graph with detection is about 45% slower than loading the graph without detection (by just adding "type": "module"
to pacakge.json) https://github.com/nodejs/node/pull/55238
Running the fixtures from the benchmarks locally, the numbers look like this:
$ hyperfine "../node/out/Release/node load.js"
Benchmark 1: ../node/out/Release/node load.js
Time (mean ± σ): 1.799 s ± 0.047 s [User: 1.827 s, System: 0.636 s]
Range (min … max): 1.738 s … 1.875 s 10 runs
$ echo '{ "type": "module" }' > package.json
$ hyperfine "../node/out/Release/node load.js"
Benchmark 1: ../node/out/Release/node load.js
Time (mean ± σ): 1.135 s ± 0.027 s [User: 1.315 s, System: 0.491 s]
Range (min … max): 1.088 s … 1.195 s 10 runs
@joyeecheung good work. It seems the overhead get worse the more files it adds.
Should I close this then?
In addition to performance hit, I think the warning can also be also updated a bit to remind about the possibility of misinterpretation and be more explicit about what's causing the ESM interpretation. e.g. if a beginner attempts to throw a top-level await into an existing CommonJS file:
const fs = require('fs');
await fs.promises.readFile('./README.md');
Previously you get
await fs.promises.readFile('./README.md');
^^^^^
SyntaxError: await is only valid in async functions and the top level bodies of modules
Now you get
const fs = require('fs');
^
ReferenceError: require is not defined in ES module scope, you can use import instead
At least, the second hint is confusing/not as practical when they are already modifying a CommonJS codebase (and today it's still very common for CommonJS code bases to have package.json without "type", so it's easier for them to fall into this, whereas ESM code bases that uses .js
have been doing "type": "module"
anyway so everything would've been clearer), and it would useful for them to learn why require
is now suddently unavailable in this file. This also applies to other less-obvious syntaxes that can trigger the ESM interpretation (e.g. const module = ..
).
I did some experiments with
--experimental-detect-module
and I've found that the added cost of "typeless"package.json
is almost negligible, around 1ms on my system.I recommend we remove the warning.