toss / es-toolkit

A modern JavaScript utility library that's 2-3 times faster and up to 97% smaller—a major upgrade to lodash.
https://es-toolkit.slash.page
Other
6.16k stars 252 forks source link

Dist contains bare imports of empty chunks in ESM output #216

Closed fvsch closed 1 month ago

fvsch commented 1 month ago

In the dist folder produced by yarn build, a number of .mjs files contain bare imports for chunks which:

  1. have some exports, but those exported names are not used by the bare imports.
  2. Or contain no code at all (only a reference to an empty source map), for example:

The dist/index.mjs module contains 11 bare imports, looking like this:

import "./chunk-NKLIE2OI.mjs";
import "./chunk-QKYYOLJ6.mjs";
// …

And there are 93 other dist/**/*.mjs files containing 148 bare imports. For instance, dist/string/camelCase.mjs looks like this:

import {
  camelCase
} from "../chunk-Y5V7UAZZ.mjs";
import "../chunk-U56ZOM7L.mjs";
import "../chunk-WVQ7YVRK.mjs";
import "../chunk-2MM5EJJX.mjs";
export {
  camelCase
};
//# sourceMappingURL=camelCase.mjs.map

I wonder if those bare imports for empty files or unused exports affect the performance of the package. In Node it would add unnecessary disk i/o, and in the browser it would add unnecessary HTTP requests (e.g. if loading a function from es-toolkit from a CDN).

I suspect this was introduced by the move to tsup in https://github.com/toss/es-toolkit/pull/21, and I can see the issue in the published package all the way back to 1.1.0.

It looks like there are a few issues in the https://github.com/egoist/tsup repository that might match this problem, but they haven't seen any activity. It looks like it might be a problem with multiple entries. Here are a couple issues: https://github.com/egoist/tsup/issues/730 https://github.com/egoist/tsup/issues/841

raon0211 commented 1 month ago

Seems that this looks like an issue for me, too. We might try vite, rollup, or other bundlers.

fvsch commented 1 month ago

I looked into this a bit more, and the issue seems to be in esbuild.

Combining the options { bundle: true, splitting: true } (where the splitting option only applies to ESM builds) results in things like:

I also tried a bunch of different configurations for tsup and esbuild, to see if there are improvements that can be done with the current tooling.

Current state of the dist output

The current state, beyond the empty chunks, needless bare imports, and less-than-ideal location of function implementation code (into dist/chunk-{hash}.mjs instead of dist/{category}/{functionName}.mjs), looks like:

  1. For CJS, we get single-file bundles:

    • dist/index.js contains the whole library;
    • dist/{category}/index.js contains all the functions for that category (e.g. all the array functions in dist/array/index.js), duplicating what's in dist/index.js (but that does enable require('es-toolkit/{category}'));
    • dist/{category}/{functionName}.js is a CommonJS bundle for that specific function (and the functions it imports, if any, see dist/string/camelCase.js for an example).
  2. For ESM, we get an output that is closer to the original source:

    • The different src/**/index.ts files result in corresponding dist/**/index.mjs files with similar imports/exports.
    • We also have dist/{category}/{functionName}.mjs files for each function.
    • But all the actual code for functions (and the few helpers some of them use) get extracted into dist/chunk-{hash}.mjs files.
  3. For TypeScript declarations, we get two sets of declaration files, one for the CommonJS build and one for the ESM build. In my tests they look completely identical, they just have different extensions. There might be an opportunity to keep just one set.

Possible changes

I think it would be good to specify what kind of result we want in dist exactly, and see if the current tooling can achieve it, or if a change might be needed.

For CommonJS, the bundling approach might be okay, especially if the CommonJS build is meant as a fallback and the expectation is that modern bundlers would prefer the ESM build. It might be possible to remove duplication and cruft for CommonJS with a few tweaks:

// dist/index.js (CommonJS entrypoint)
module.exports = {
  ...require('./array/index.js'),
  ...require('./array/index.js'),
  // …
};

For ESM, with tsup/esbuild it looks like you can get a cleaner output by disabling bundling and splitting:

// tsup.config.ts
export default defineConfig({
  format: 'esm',
  entry: ['src/**/*.ts', '!**/*.{spec,test,test-d}.*'],
  bundle: false,
  splitting: false,
});

Basically, that's transpiling each TS file individually.

The issue with this approach is that without bundling, the extensions in import specifiers don't get rewritten, so you end up with something like:

// dist/object/index.mjs
import { omit } from "./omit.ts";
import { omitBy } from "./omitBy.ts";
import { pick } from "./pick.ts";
import { pickBy } from "./pickBy.ts";
import { invert } from "./invert.ts";
import { clone } from "./clone.ts";

This is broken, because we'd need .mjs extensions here. It's a known issue with ESBuild, see for instance:

https://github.com/evanw/esbuild/issues/1505 https://github.com/evanw/esbuild/issues/2435

It's also an issue with TypeScript, which decided it would allow extensions in import specifiers in 5.0, but also decided it would not rewrite those extensions (to .js, .mjs or .cjs) when compiling. ESBuild is only following what TypeScript does here.

TypeScript's "solution" to this problem is that you can author your TypeScript files with imports that look like this:

// src/object/index.ts
export { omit } from './omit.js';
export { omitBy } from './omitBy.js';
export { pick } from './pick.js';
export { pickBy } from './pickBy.js';
export { invert } from './invert.js';
export { clone } from './clone.js';

and despite your import specifiers saying .js, it will import the corresponding .ts files.

With that pattern, the ESM output from tsup/ESBuild will have .js extensions in import specifiers instead of .ts.

But to make use of that solution in a Node.js package, you basically need to set that package to "type": "module" instead of the default "type": "commonjs", and I don't know if that's doable for this package.

Keeping tsup as the builder

My impressions so far after close to a day of testing:

Alternatives to tsup

I think this depends on the design of the desired output in dist.

raon0211 commented 1 month ago

Let me experiment with various bundlers and see how it goes.

fvsch commented 1 month ago

A possible approach in 2 builds:

  1. First build using tsc to generate ESM and declarations (.js and .d.ts), no bundling.
  2. Second build using Rollup and @rollup/plugin-typescript to generate a few CommonJS bundles (.cjs).
fvsch commented 1 month ago

I’m experimenting a bit and getting good results with Rollup. I'll share a branch when I have something that looks decent.

fvsch commented 1 month ago

I created #255 as a draft PR to show my current progress with Rollup.

Can be tested with:

rm -rf dist
yarn rollup -c rollup.config.mjs

and then inspecting the contents of the dist folder.

raon0211 commented 1 month ago

Thanks! Let's see how it goes.