evanw / esbuild

An extremely fast bundler for the web
https://esbuild.github.io/
MIT License
37.83k stars 1.12k forks source link

Two versions of the same module (cjs and esm) end up in the bundle #1950

Closed ggarek closed 2 years ago

ggarek commented 2 years ago

Hello and thank you for writing such a great tool!

I found an issue while migrating to esbuild, and thought I'd share. It may be related to some other reports around module imports.

The issue

I've come across an issue using esbuild when two versions (cjs and esm) of the same module(s) end up in the final bundle.

Building the same codebase with webpack doesn't seem to have this issue, though I am not sure how webpack addresses it. I didn't look into this.

An example

Let's say we have this import:

import { ListItem } from '@material-ui/core';

This leads to bundling @material-ui/core/esm/ButtonBase/TouchRipple.js among other modules.

If now we add another particular import:

import { ListItem } from '@material-ui/core';
import { Picker } from '@material-ui/pickers/Picker/Picker';

we end up with two versions of the same module in the bundle:

1) @material-ui/core/esm/ButtonBase/TouchRipple.js 2) @material-ui/core/ButtonBase/TouchRipple.js

The @material-ui/picker package keeps its esm modules under esm/ dir, so we do indeed ask for the cjs version in the second import. Just looking at the import statement, it is not easy to notice, though.

We could fix it...

We can fix it by changing the second import:

import { ListItem } from '@material-ui/core';
// import { Picker } from '@material-ui/pickers/Picker/Picker';
import { Picker } from '@material-ui/pickers/esm/Picker';

Now we have just the @material-ui/core/esm/ButtonBase/TouchRipple.js in the bundle.

It is quite easy to make this kind of mistake. It would be nice if we could help developer avoid making it. I am not sure how exactly esbuild can be helpful here, but I think it is possible to detect the possible occurrence of this problem and and warn the developer.

Going around a package entry point

The @material-ui/picker package defines both entrypoints, "main" (for cjs) and "module" (for esm), but going through the entry point is not the only way to import a package. In this case we refer to one of the modules of the package directly, effectively going around the entry point. To write this import correctly developer has to know how a particular package structures its modules and which version they want to import. That sounds fragile.

What could we do?

These are some of the ideas what we could do about it

1) We could build a graph of package modules for each entry point, check if the graphs share any nodes, and if they don't we could spit out a warning or fail the build if nodes from the both graphs end up in the bundle. 2) We could list path prefixes for the modules that are allowed (That's dirty, because it is brittle and manual)

If you have any ideas or opinions in this regard, I would appreciate if you can share.

Thanks for help!

p.s.: I will share a simple repro later.

evanw commented 2 years ago

You can see the exact import path with --analyze=verbose. Documentation is here: https://esbuild.github.io/api/#analyze. It looks something like this (simplified to just the files you mentioned):

  index.js ──────────────────────────────────────────────────────────────────────────────────── 1.6mb ─ 100.0%
   ├ node_modules/@material-ui/core/ButtonBase/TouchRipple.js ───────────────────────────────── 9.6kb ─── 0.6%
   │  └ node_modules/@material-ui/core/ButtonBase/ButtonBase.js
   │     └ node_modules/@material-ui/core/ButtonBase/index.js
   │        └ node_modules/@material-ui/core/IconButton/IconButton.js
   │           └ node_modules/@material-ui/core/IconButton/index.js
   │              └ node_modules/@material-ui/pickers/views/Calendar/CalendarHeader.js
   │                 └ node_modules/@material-ui/pickers/views/Calendar/Calendar.js
   │                    └ node_modules/@material-ui/pickers/Picker/Picker.js
   │                       └ index.js
   ├ node_modules/@material-ui/core/esm/ButtonBase/TouchRipple.js ───────────────────────────── 7.8kb ─── 0.5%
   │  └ node_modules/@material-ui/core/esm/ButtonBase/ButtonBase.js
   │     └ node_modules/@material-ui/core/esm/ButtonBase/index.js
   │        └ node_modules/@material-ui/core/esm/index.js
   │           └ index.js
   └ index.js ───────────────────────────────────────────────────────────────────────────────── 104b ──── 0.0%

Here you can see the that problem is when node_modules/@material-ui/pickers/views/Calendar/CalendarHeader.js imports node_modules/@material-ui/core/IconButton/index.js instead of node_modules/@material-ui/core/esm/IconButton/index.js. You can use esbuild's --log-level=verbose flag to figure out why. It generates a lot of output. Here's the relevant part:

⬥ [VERBOSE] Resolving import "@material-ui/core/IconButton" in directory "./node_modules/@material-ui/pickers/views/Calendar" of type "require-call"

  Read 10 entries for directory "./node_modules/@material-ui/pickers/views/Calendar"
  No "browser" map found in directory "./node_modules/@material-ui/pickers/views/Calendar"
  Searching for "@material-ui/core/IconButton" in "node_modules" directories starting from "./node_modules/@material-ui/pickers/views/Calendar"
    Parsed package name "@material-ui/core" and package subpath "./IconButton"
    Checking for a package in the directory "./node_modules/@material-ui/core/IconButton"
    Read 142 entries for directory "./node_modules/@material-ui/core"
    No "browser" map found in directory "./node_modules/@material-ui/core"
    Attempting to load "./node_modules/@material-ui/core/IconButton" as a file
      Checking for file "IconButton"
      Checking for file "IconButton.tsx"
      Checking for file "IconButton.ts"
      Checking for file "IconButton.jsx"
      Checking for file "IconButton.js"
      Checking for file "IconButton.css"
      Checking for file "IconButton.json"
      Failed to find file "IconButton"
    Attempting to load "./node_modules/@material-ui/core/IconButton" as a directory
      Read 142 entries for directory "./node_modules/@material-ui/core"
      The file "./node_modules/@material-ui/core/IconButton/package.json" exists
      Read 5 entries for directory "./node_modules/@material-ui/core/IconButton"
      Searching for main fields in "./node_modules/@material-ui/core/IconButton/package.json"
        Did not find main field "browser"
        Found main field "module" with path "../esm/IconButton/index.js"
          No "browser" map found in directory "./node_modules/@material-ui/core/IconButton"
          Attempting to load "./node_modules/@material-ui/core/esm/IconButton/index.js" as a file
            Checking for file "index.js"
            Found file "index.js"
        No "browser" map found in directory "./node_modules/@material-ui/core/IconButton"
        Failed to find file "./node_modules/@material-ui/core/IconButton/index.tsx"
        Failed to find file "./node_modules/@material-ui/core/IconButton/index.ts"
        Failed to find file "./node_modules/@material-ui/core/IconButton/index.jsx"
        Found file "./node_modules/@material-ui/core/IconButton/index.js"
        Resolved to "./node_modules/@material-ui/core/IconButton/index.js" because of "require"
  Read 5 entries for directory "./node_modules/@material-ui/core/IconButton"
  Marking this file as having no side effects due to "./node_modules/@material-ui/core/IconButton/package.json"
  Primary path is "./node_modules/@material-ui/core/IconButton/index.js" in namespace "file"

This line is telling: Resolved to "./node_modules/@material-ui/core/IconButton/index.js" because of "require". This is the result of some heuristics in esbuild to try to do the thing that's most likely to work in packages with both CommonJS and ESM. When selecting between node_modules/@material-ui/core/IconButton/index.js instead of node_modules/@material-ui/core/esm/IconButton/index.js, esbuild goes with the CommonJS one because it's imported using require().

Typically people expect require() to always match CommonJS instead of ESM, and some libraries actually break when require() resolves to an ES module. You can see the issue where this heuristic originated for lots more context: #363 (the problem was with the node-fetch package). This heuristic is also in the documentation here: https://esbuild.github.io/api/#platform.

While esbuild's heuristics are the best for maximum compatibility, you may be able to turn them off if none of the packages you are using are affected by the issue. In that case you can configure --main-fields=module,main to force esbuild to always prefer module before main even when the path is imported by require(). That brings the module size for your example down from 1.6mb to 1.4mb. Keep in mind that this can break certain libraries that need require() to resolve to CommonJS. I'm guessing always preferring module before main is what Webpack is doing by default anyway because I don't think Webpack has this heuristic.

Another way to solve your issue on the importing end is to only ever import from the package entry point, not from nested paths. It looks like the package entry points for Material UI always re-export all nested paths. For example, you can replace your import with import { Picker } from '@material-ui/pickers' and the bundle size shrinks from 1.6mb to 1.3mb. So this is even smaller than the above --main-fields=module,main approach (since you are only bundling ESM and are no longer bundling CommonJS). You could potentially create a plugin or write a lint rule to enforce this.

Along the lines of being ESM-only, you could also create a plugin for esbuild or a lint rule for your linter that just forbids non-ESM imports within the @material-ui scope. This isn't something that esbuild does because it doesn't know what path conventions the package is using, but you can special-case that for the packages you care about. You mentioned the overhead of writing down all of the path prefixes but it could be something simpler like "only allow top-level imports or sub-paths with /esm/ in them" if the structure of that package allows for it.

The Material UI package authors could potentially fix this themselves by changing their package.json files to use node's new exports field. More documentation is here: https://nodejs.org/api/packages.html#package-entry-points. This was designed for dual CommonJS/ESM packages like this. It lets you have a package where you only have one path for both CommonJS and ESM, say @material-ui/pickers/Picker, and whether it resolves to CommonJS or to ESM depends on whether it was imported using import or require. There are more details about how to do that here: https://nodejs.org/api/packages.html#dual-commonjses-module-packages. Specifically if you're worried about people using a mix of import and require to interact with your package, you would have the ES modules re-export the CommonJS modules. Then you won't ever end up with two copies of the same code in your bundle. But doing this is a big breaking change for a package and would come at a regression in code size for the optimal case, so I can understand why packages don't do it.

TL;DR: This situation is an unfortunate mess but there are a few different things to try. Using --main-fields=module,main will avoid the duplication in this case. But enforcing ESM-only imports is strictly better in terms of code size because it avoids CommonJS entirely.

kzc commented 2 years ago

This situation is an unfortunate mess but there are a few different things to try. Using --main-fields=module,main will avoid the duplication in this case.

There could be other cases of this in the JS ecosystem that the library authors are unaware of. It would be quite helpful if esbuild were to emit a warning if both CommonJS and ESM versions of the same module are bundled with the suggestion listed above.

Keep in mind that this can break certain libraries that need require() to resolve to CommonJS.

You get a different sort of problem with module-specific static variables if both cjs and esm versions of the same module are bundled - similar to a local version of split-brain. So it can be a correctness issue as well as a bundle size issue.

evanw commented 2 years ago

emit a warning if both CommonJS and ESM versions of the same module are bundled

One problem is that "same module" isn't a precise thing. That's something that you only know for sure if you read and understand that package's build scripts. Packages could use different naming schemes for this (e.g. foo.js and esm/foo.js, foo.cjs and foo.mjs, foo.js and foo.esm.js). How should esbuild know which ones are which? It could emit a warning any time you import two files with the same filename prefix from the same package. But that could cause annoying false positives if the files are unrelated (e.g. util.js).

Another problem is that this warning is likely to be really confusing. The community has come up with a lot of creative ways of changing how path resolution happens and both packages and esbuild do lots of things to try to make it work. The way to adjust path resolution to be the way you want may be very non-intuitive and might not be something that esbuild can automatically determine for you. As you can see above, the explanation of this situation was very nuanced and different options have different trade-offs. Also some options are only useful if they don't happen to break other packages that you're using, which isn't something esbuild can know about.

Also this warning is likely to be really annoying, especially because you likely won't be able to do anything about it if the problem happens to be due to another package that's not yours. I expect that I'd get a lot of complaints about something like this immediately if I were to add it.

Edit: Another thing is that node's specification of exports actually requires that multiple copies of a module should be bundled in some situations: https://nodejs.org/api/packages.html#dual-commonjses-module-packages. The way to avoid duplication is supposed to be for one file to import the other. But then it's fine if both of them end up in the bundle because it's supposed to happen by design. So the warning would be a false positive there too.

This seems like something that's case-specific to me. So it's probably best for whoever cares about this to write their own code for whatever constraints they want to enforce, likely using esbuild's metafile feature for information about the build.

You get a different sort of problem with module-specific static variables if both cjs and esm versions of the same module are bundled - similar to a local version of split-brain. So it can be a correctness issue as well as a bundle size issue.

Yup. There's no solution that works well for everything unfortunately. One way to try to solve this is --main-fields=main. But of course then you lose tree-shaking.

kzc commented 2 years ago

Also this warning is likely to be really annoying, especially because you likely won't be able to do anything about it if the problem happens to be due to another package that's not yours. I expect that I'd get a lot of complaints about something like this immediately if I were to add it.

I can see that. What about emitting the "warning" in --log-level=info?

One problem is that "same module" isn't a precise thing.

Perhaps only "warn" for modules without a leading dot or slash in node_modules with no deep linking - i.e., require("foo") and import "foo" are both encountered in a build - and possibly import("foo"). The original module name, import/require type and final output format would have to be tracked.

Hmm. It could get complicated quickly. But it would be useful.

evanw commented 2 years ago

require("foo") and import "foo" are both encountered in a build

This case is already handled by esbuild (assuming the package is using module/main instead of exports). In this case esbuild only bundles the CommonJS version. The ESM version is not bundled because the import "foo" is deliberately redirected by esbuild to the CommonJS version due to the require("foo") to avoid duplication. So there would be nothing to warn about here. This case isn't what happened in this issue.

If the package is using exports instead then esbuild will bundle both the CommonJS and ESM versions. But that's the required behavior from the specification so that's supposed to happen, and doesn't seem appropriate for a warning. This case also isn't what happened in this issue.

evanw commented 2 years ago

Closing this issue because esbuild is working as intended. This is a common problem with bundlers and dual-format packages and is the unfortunate result of a set of conventions that have been created and adopted by the community. The solution is to work around it on the package use side and/or the package authoring side as described above.

ggarek commented 2 years ago

Thank you, @evanw, for very prompt and elaborate response. Your answer gave me plenty of insights into esbuild and the problem domain. I really appreciate the effort and the quality of writing 😃

It is great that you joined, @kzc. Your questions added value to the discussion. Thank you.