evanw / esbuild

An extremely fast bundler for the web
https://esbuild.github.io/
MIT License

[Feature] Code splitting on async import() statements. #16

Open tracker1 opened 4 years ago

tracker1 commented 4 years ago

Support code splitting on dynamic import() statements, and additionally split/join on shared bundles for shared dependency models.

evanw commented 4 years ago

This is definitely something I plan to get to because I want to be able to use it myself. Right now import(path) turns into Promise.resolve().then(() => require(path)) so dynamic imports still "work" although they don't result in additional bundles. In the future it will generate separate bundles. I may also add support for common chunk and/or more advanced shared dependency analysis.
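
To illustrate the current behavior (a rough sketch; the file name is made up):

// source
import('./dialog.js').then((dialog) => dialog.open());

// what esbuild currently emits, conceptually: the dynamic import is rewritten
// in place, so ./dialog.js is bundled into the same output file instead of
// being split into a separate one
Promise.resolve().then(() => require('./dialog.js')).then((dialog) => dialog.open());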

jpmaga commented 4 years ago

@evanw do you have any kind of roadmap somewhere for esbuild? I am particularly interested in this feature, and would be cool to know where it is in terms of planning. Cheers.

andrewvarga commented 4 years ago

This would be awesome to have!

evanw commented 4 years ago

I don't have a specific date but I'm currently focused on a rewrite of the bundler to enable code splitting, tree shaking, ES6 module export, and a few other features. I have to do these together because they are all interrelated.

I've done the R&D prototype to prove it out and I've settled on an approach. I'm currently working on doing the rewrite for real on a local branch. There's still a lot left to do to not break features I've added in the meantime (stdin/stdout support, transform API, etc) so it'll take a while. I have a lot of test failures to work through :)

I was worried about the performance hit because the graph analysis algorithms inherently reduce parallelism, but some early performance measurements seem to indicate that it won't slow it down that much, if any. I hope to ship this sometime in the next few weeks. We'll see how it goes!

jpmaga commented 4 years ago

I don't have a specific date but I'm currently focused on a rewrite of the bundler to enable code splitting, tree shaking, ES6 module export, and a few other features. I have to do these together because they are all interrelated.

I've done the R&D prototype to prove it out and I've settled on an approach. I'm currently working on doing the rewrite for real on a local branch. There's still a lot left to do to not break features I've added in the meantime (stdin/stdout support, transform API, etc) so it'll take a while. I have a lot of test failures to work through :)

I was worried about the performance hit because the graph analysis algorithms inherently reduce parallelism, but some early performance measurements seem to indicate that it won't slow it down that much, if any. I hope to ship this sometime in the next few weeks. We'll see how it goes!

Damn! You're the man. This is the only thing I am missing to start using it in production, in smaller projects for starters, and see how it goes. PS: I have tested a couple of projects locally, without code splitting, and everything worked flawlessly, even in one with a fairly large codebase using React and TypeScript. 👍

ponsifiax commented 4 years ago

Hello here, do you have any news about this? That's the last feature we need to use it in production :+1:

evanw commented 4 years ago

Do you have any news about this?

It's mostly working already. The chunk splitting analysis has already landed. All that's left is to bind imports and exports across chunks. I'm working on that in a branch and this will be my main focus soon.

evanw commented 4 years ago

I just released version 0.5.15 with an experimental version of code splitting. See the release notes for details. It's still a work in progress but it's far enough along now that it's ready for feedback. Please try it out and let me know what you think.
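
For anyone trying it out, enabling it looks roughly like this with the JavaScript API (a minimal sketch; the entry point names and output directory are made up):

const esbuild = require('esbuild');

esbuild.build({
  entryPoints: ['src/home.js', 'src/about.js'],
  bundle: true,
  splitting: true, // the new experimental flag (--splitting on the CLI)
  format: 'esm',   // code splitting currently targets the esm output format
  outdir: 'out',
}).catch(() => process.exit(1));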

garygreen commented 4 years ago

Excellent news! Thank you for all your hard work on this Evan. Code splitting was vital for us. Does this code splitting feature split css imports into separate files and add them at runtime? Simple CSS support is the next main thing we are eagerly looking forward to.

evanw commented 4 years ago

Simple CSS support is the next main thing we are eagerly looking forward to.

You and me both! CSS support is currently the next major feature I want to implement after code splitting. That’s tracked by a separate issue, however: #20.

matthiasg commented 4 years ago

Works really well in initial testing. We will test more complicated setups (Rush repo, nested pnpm deps) more fully over the next few weeks.

evanw commented 4 years ago

That's great to hear! Thanks so much for trying it out.

evanw commented 4 years ago

I have a small progress update on code splitting. From the release notes for the upcoming release (not out yet):

Code that is shared between multiple entry points is separated out into "chunk" files when code splitting is enabled. These files are named chunk.HASH.js where HASH is a string of characters derived from a hash (e.g. chunk.iJkFSV6U.js).

Previously the hash was computed from the paths of all entry points which needed that chunk. This was done because it was a simple way to ensure that each chunk was unique, since each chunk represents shared code from a unique set of entry points. But it meant that changing the contents of the chunk did not cause the chunk name to change.

Now the hash is computed from the contents of the chunk file instead. This better aligns esbuild with the behavior of other bundlers. If changing the contents of the file always causes the name to change, you can serve these files with a very large max-age so the browser knows to never re-request them from your server if they are already cached.

Note that the names of entry points do not currently contain a hash, so this optimization does not apply to entry points. Do not serve entry point files with a very large max-age or the browser may not re-request them even when they are updated. Including a hash in the names of entry point files has not been done in this release because that would be a breaking change. This release is an intermediate step to a state where all output file names contain content hashes.

The reason why this hasn't been done before now is because this change makes chunk generation more complex. Generating the contents of a chunk involves generating import statements for the other chunks which that chunk depends on. However, if chunk names now include a content hash, chunk generation must wait until the dependency chunks have finished. This more complex behavior has now been implemented.

Care was taken to still parallelize as much as possible despite parts of the code having to block. Each input file in a chunk is still printed to a string fully in parallel. Waiting was only introduced in the chunk assembly stage where input file strings are joined together. In practice, this change doesn't appear to have slowed down esbuild by a noticeable amount.

matthiasg commented 4 years ago

@evanw Thanks a lot for this detailed write-up! This is the kind of information required for using a tool such as this.

evanw commented 4 years ago

Another code splitting update:

I finally got around to implementing per-chunk symbol renaming, which I view as required for the code splitting feature. I've made several attempts at this in the past but I haven't landed them because I don't want to severely regress performance (or memory usage, which I've started to also pay attention to). I finally figured out a good algorithm for doing per-chunk symbol renaming that's fast and parallelizable while not using too much memory. It's actually two algorithms, one when minifying and a different one when not minifying.

From the release notes:

Previously, bundling with code splitting assigned minified names using a single frequency distribution calculated across all chunks. This meant that typical code changes in one chunk would often cause the contents of all chunks to change, which negated some of the benefits of the browser cache.

Now symbol renaming (both minified and not minified) is done separately per chunk. It was challenging to implement this without making esbuild a lot slower and causing it to use a lot more memory. Symbol renaming has been mostly rewritten to accomplish this and appears to actually usually use a little less memory and run a bit faster than before, even for code splitting builds that generate a lot of chunks. In addition, minified chunks are now slightly smaller because a given minified name can now be reused by multiple chunks.

guybedford commented 4 years ago

@evanw it would be very interesting if you could expand somewhere on the exact symbol naming technique you converged on here. I'm sure it will make sense looking at the outputs too though of course.

evanw commented 4 years ago

@evanw it would be very interesting if you could expand somewhere on the exact symbol naming technique you converged on here.

I just wrote up some documentation about the parallel symbol minification algorithm here.

The non-minified symbol renaming algorithm isn't described in the docs yet but it's pretty simple. Just rename symbols to avoid collisions by appending an increasing number to the name until there's no longer a collision. Each symbol will need to check for collisions in all parent scopes. Symbols in top-level scopes must be renamed in serial but symbols in nested scopes can be renamed in parallel.
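
In other words, something like this (a sketch of the idea, not esbuild's actual code):

// Pick a collision-free name given the names already reserved in this scope
// and in all parent scopes.
function renameSymbol(name, reservedNames) {
  let candidate = name;
  let counter = 2;
  while (reservedNames.has(candidate)) {
    candidate = name + counter; // append an increasing number
    counter++;
  }
  reservedNames.add(candidate);
  return candidate;
}

// Example: two modules in the same chunk both declare a top-level "foo"
const reserved = new Set(['foo']);
console.log(renameSymbol('foo', reserved)); // "foo2"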

mtsewrs commented 3 years ago

@evanw Do you plan on supporting code splitting with other formats apart from esm?

evanw commented 3 years ago

@evanw Do you plan on supporting code splitting with other formats apart from esm?

Yes, that's why this issue is still open. However I want to fix issues with the current esm code splitting first: #399.

DanielHeath commented 3 years ago

If the file contents are included in the hash, does that imply that circular references cannot be built (since each file contains a reference to another)?

Or is the hash calculated before rewriting the imported filenames?

evanw commented 3 years ago

does that imply that circular references cannot be built (since each file contains a reference to another)?

Yes, code splitting currently generates an acyclic module graph.

The current automatic code splitting algorithm makes sure that a) a given piece of code only ever lives in one chunk and b) a given entry point doesn't import any code that it won't use. This means it generates one chunk for each unique overlap of entry points. So if there are three entry points A, B, and C, that means there could potentially be up to 7 chunks: A, B, C, A+B, A+C, B+C, and A+B+C. The chunk for A would only include code accessible by A but not by B or by C, the chunk A+B includes all code accessible by A and B but not by C, and the chunk A+B+C is for all code that is used by all entry points. Because of this structure, cyclic imports are never generated by construction. Two chunks wouldn't ever need to import each other, because if they did reference each other they would be considered a connected component in the graph and would have been written out as part of the same chunk.

This automatic algorithm was a good experiment but it has some drawbacks. The main drawback is just that it's automatic. Many people want to have control over the algorithm in various ways. With many entry points, I'm sure you can see how the current algorithm can potentially generate a lot of chunks due to the combinatorial explosion of overlaps. People familiar with ESM have said that this is fine since the browser can handle a lot of chunk files (>100). Other people are turned off by the idea of having lots of generated chunks and have been requesting manual control over chunk files. Potentially people are just more used to fewer chunks from Webpack setups with manual chunk generation and lack of HTTP/2. I'm not sure what to think about the trade-offs between these approaches because I haven't done extensive performance analysis myself.

To implement manual chunk assignment you would need two things:

  1. You would need the ability to include code in the bundle that's guaranteed to never be used.

    For example, people may want to direct esbuild to turn a whole library into a single chunk even though not all of that library is used by all entry points. This will result in dead code. Right now this is impossible because esbuild's tree shaking algorithm automatically removes dead code. I'm currently designing a different linking model that will allow for keeping dead code while still keeping most of the benefits of ESM's static binding. It involves making module execution lazily-evaluated while still keeping module binding eagerly-evaluated. I'm not sure if this approach will work out but it seems hopeful.

  2. You would need the ability for chunks to potentially participate in an import cycle.

    Manual chunk assignment means esbuild can't generate an acyclic graph since code in a connected component may have multiple different manual chunk labels. I can think of two ways of linking cyclic chunks together. One way is to use dummy text for import paths, calculate all of the file hashes, then swap the dummy text for the real import paths. The file hashes will be "wrong" in that they won't be a hash of the ultimate file contents, but presumably it'd still be ok for cache invalidation as long as you mix in the hashes of all files involved in a cycle with each other. The other way is to pull out the hashes into an import map. That adds a level of indirection between the import paths and the actual hashed file names. It can lead to better caching because changing a dependency doesn't involve also changing the dependents, but import maps aren't a part of the web platform yet so this approach is presumably not viable for a while.
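
To make the first option concrete, the idea is roughly this (a simplified sketch, not esbuild's implementation; the chunk contents and placeholder names are made up):

const crypto = require('crypto');
const hash = (s) => crypto.createHash('sha256').update(s).digest('hex').slice(0, 8);

// Step 1: print each chunk with a stable placeholder where the import paths go.
const chunkA = 'import { f } from "__CHUNK_B__"; export function g() { return f() + 1; }';
const chunkB = 'import { g } from "__CHUNK_A__"; export function f() { return 2; }';

// Step 2: hash the contents while the placeholders are still in place, mixing
// together the hashes of all chunks in the cycle so a change to either one
// invalidates both.
const cycleHash = hash(hash(chunkA) + hash(chunkB));
const nameA = 'chunk-A.' + cycleHash + '.js';
const nameB = 'chunk-B.' + cycleHash + '.js';

// Step 3: swap the placeholders for the real hashed file names.
const finalA = chunkA.replace('__CHUNK_B__', './' + nameB);
const finalB = chunkB.replace('__CHUNK_A__', './' + nameA);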

That's where my thinking is at the moment. I'm currently in the design phase for the next version of code splitting. The next iteration should hopefully finish the code splitting feature. I want to address the current known import ordering bug, get code splitting working for the cjs and iife formats, and potentially also implement manual chunk assignment. And it'd be really great to do top-level await too, although I may punt on that.

Edit: part of why I'm posting this is that I'm curious what people think about the path embedding approach vs. the import map approach.

DanielHeath commented 3 years ago

One way is to use dummy text for import paths, calculate all of the file hashes, then swap the dummy text for the real import paths

I think that makes the most sense, though the dummy text needs to be somehow derived from the path so that importing a different path generates a new fingerprint.

So if there are three entry points A, B, and C, that means there could potentially be up to 7 chunks: A, B, C, A+B, A+C, B+C, and A+B+C

More confusing yet - if you have loaded chunk A, then navigate to an area that needs B, you could reasonably want B - A in order to avoid re-fetching libraries used by both.

The chunk-splitting API I would like to use looks something like:

(path: string, suggestedLocations: Array<string>): Array<string>

For instance, if node_modules/react/index.js were passed as the path you could return ["react"] to indicate that esbuild should generate a file output/react-fingerprint.js; all entrypoints that require react will need to reference that file in their HTML.

suggestedLocations would be everywhere the chunk is currently getting written to, e.g. A, B, C, A+B, A+C, B+C, and A+B+C.

evanw commented 3 years ago

More confusing yet - if you have loaded chunk A, then navigate to an area that needs B, you could reasonably want B - A in order to avoid re-fetching libraries used by both.

That is the point of splitting up code like this. Sorry, using the same letter for an entry point input file and the corresponding output chunk was confusing.

Let's say the entry points are lower-case letters a, b, and c. Chunk B only includes code for entry point b but not entry points a or c. A given piece of code only ever ends up in one chunk so B - A (and really any intersection between any two chunks) is the empty set.

Code shared between entry points a and b (but not with entry point c) is placed in chunk A+B. Chunk A would import chunks A+B, A+C, and A+B+C to get all the code needed by entry point a. Chunk B would import chunks A+B, B+C, and A+B+C. When you move from chunk A to chunk B, the browser would avoid re-fetching the chunks A+B and A+B+C since it has already fetched them. The browser would only need to download chunk B and B+C (which represents entry point b - a). This would be more clear as a Venn diagram...

you could return ["react"] to indicate that esbuild should generate a file output/react-fingerprint.js

I think this is similar to the design I'm thinking of. You can return an optional manual chunk name from your plugin and if it is present, all code with that same manual chunk name will be forced to be in the same chunk and all code without that manual chunk name would be forced to be in some other chunk.

Right now I'm thinking that tree shaking would still be active for manual chunks, although it would only remove code that isn't used by any entry point. This will likely often result in dead code in your bundle because if a shared library is assigned to a manual chunk, all entry points which use that library would be forced to load all code in that library used by any entry point, even if it's only ever used by one entry point. Manual chunking will prevent esbuild from inlining code from that library that's only used by a single entry point directly into that entry point itself.

arcanis commented 3 years ago

Is it correct that bundle splitting only works at the moment for modules shared between multiple entry points? I think I'm hitting the first case defined in the OP: a single-page application, with many asynchronous imports when switching pages. No chunks are generated, causing the output to be super-large.

evanw commented 3 years ago

You have to explicitly enable code splitting with --splitting. It's not enabled by default. When enabled, every target of an import() expression is considered to be an entry point. So you should be getting multiple output files in this case. Also code sharing should kick in since then there are multiple entry points.
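
Concretely, something like this (hypothetical file names) should produce more than one output file once --splitting and --format=esm are enabled:

// src/app.js
export async function openSettings() {
  // With splitting enabled, ./settings.js is treated as an entry point and is
  // emitted as a separate chunk that is only fetched when this code runs.
  const settings = await import('./settings.js');
  settings.show();
}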

DanielHeath commented 3 years ago

Right now I'm thinking that tree shaking would still be active for manual chunks, although it would only remove code that isn't used by any entry point. This will likely often result in dead code in your bundle because if a shared library is assigned to a manual chunk, all entry points which use that library would be forced to load all code in that library used by any entry point, even if it's only ever used by one entry point. Manual chunking will prevent esbuild from inlining code from that library that's only used by a single entry point directly into that entry point itself.

This is why I think it's valuable to be able to (manually or otherwise) assign code to multiple chunks. If a library appears in multiple chunks, tree-shaking can be applied to each to remove the unused parts; this allows you to import a small function from a large library.

evanw commented 3 years ago

This is why I think it's valuable to be able to (manually or otherwise) assign code to multiple chunks. If a library appears in multiple chunks,

I'm not quite sure what you're saying. It might be the case that the upcoming version of esbuild's code splitting will already do what you're saying.

  1. You could be saying that identical copies of the same piece of code could be present in multiple chunks. This is incompatible with how esbuild's code splitting works. When code splitting is enabled, a given piece of code must only ever live in a single chunk. If code lived in multiple chunks and those chunks were loaded simultaneously, you'd get bugs because two copies of a module would be loaded at the same time which isn't supposed to happen.

    You can only get duplicate copies of code in separate output files when code splitting is off. But that's because then each output file is completely self-contained and never imports any other output files.

  2. You could be saying that different pieces of code from the same library (e.g. npm package) are present in different chunks, without duplication. That's how esbuild's code splitting already works. Different files from the same library can end up in different chunks. Actually (and this is different than other bundlers) different pieces of code in the same file can also end up in different chunks. With esbuild's automatic code splitting, file boundaries don't really matter for side-effect free code. Files are automatically split up into pieces and each piece can potentially live in a separate chunk. You can read more details here.

    This is important to point out because manually assigning a file as a single chunk the way I've envisioned it (when the manual chunk assignment feature is released) will actually prevent this optimization, since the assignment tells esbuild to put all code in that file in the same chunk. This will potentially improve caching if you anticipate future code using more code from that library, but otherwise doing this just creates dead code that is downloaded only to be unused. So manual chunk assignments are potentially a foot-gun.

Maybe you could point me to some resources that describe this more if you're talking about existing behavior from other bundlers?

tree-shaking can be applied to each to remove the unused parts; this allows you to import a small function from a large library.

This is always the case because esbuild's tree shaking is always active and can't be disabled. Side-effect free ESM code that is never used will always be dropped regardless of what library it's in. Basically the automatic code splitting settings should do this fine.

If you start manually assigning chunks you could end up causing some dead code in some cases. Specifically, if all entry points use different parts of the same library and you assign all code in that library to the same manual chunk, all entry points will have to pay the cost of downloading all code in that library that any entry point uses (tree shaking is still active so parts of the library that none of the entry points use will still be removed).

If you didn't use a manual chunk assignment, then esbuild would automatically compute optimal chunk boundaries for shared code resulting in no dead code. If all entry points use different non-overlapping parts of that library, you could even get no shared chunk at all because only the relevant code will have been inlined directly into the respective entry point chunks.

DanielHeath commented 3 years ago

I was suggesting 1, on the basis that most libraries are stateless, and if you customize your config to have a stateful library appear multiple times, you are inviting bugs.

However, since you've implemented splitting within a file, I think it's not required at all. The situation where two entrypoints load overlapping-but-largely-distinct subsets of a large library is pretty damn hard to hit.

It's not tractable to figure out an optimal split in cases like that automatically (barring perhaps symbolic execution, which would be an unreasonable amount of complexity to carry for such a niche feature).

joeljeske commented 3 years ago

The other way is to pull out the hashes into an import map. That adds a level of indirection between the import paths and the actual hashed file names. It can lead to better caching because changing a dependency doesn't involve also changing the dependents, but import maps aren't a part of the web platform yet so this approach is presumably not viable for a while.

I am very interested in this approach and would like to see it as an option within esbuild. It could be argued that web linking is fundamentally flawed if a content hash of dependencies appears inside a chunk. If so, then in most applications minor changes would cause cascading file changes.

I currently target SystemJS at runtime and use import maps to link everything together. I am very interested in using esbuild but this would be a requirement in order to maintain long-term cacheability. Alternatively, esbuild could write out non-hashed filename imports in its chunks, and I could generate an import map using my own content hashes and rename the output files.

I am not aware of any competing proposal to the import map spec, so I am hoping it will go through. Additionally, I suspect that any approach taken to support import map name resolution would be easily swappable to another implementation/format due to the nature of this problem.

overlookmotel commented 3 years ago

I'm late to the party on this one, but a few thoughts on chunk splitting strategies...

Code examples below use the convention that a shared chunk which contains code required by entry points "a", "b" and "c" is named "a_b_c". ESBuild actually uses content hashes for filenames, but I'm ignoring that for now. Hopefully it makes sense.

OK, going back to basics for a minute...

1. The penalty of many chunks

In my (basic) understanding, if you're serving files over HTTP/2, the penalty for an app being split into a large number of chunks is minimal, as long as the import statements for the shared chunks appear in the entry point chunks, not nested within other shared chunks.

This will only require 2 round trips to the server: 1 to fetch a.js, and a 2nd to fetch a_b.js, a_c.js and a_b_c.js in "parallel":

// a.js
import ab from './a_b.js';
import ac from './a_c.js';
import abc from './a_b_c.js';
// Now do stuff with them

Whereas this will take 3 round trips, as the browser has to wait for a_b.js or a_c.js to arrive before it knows it needs a_b_c.js:

// a.js
import ab from './a_b.js';
import ac from './a_c.js';
// Now do stuff with them

// a_b.js
import abc from './a_b_c.js';
// Now do stuff with it

// a_c.js
import abc from './a_b_c.js';
// Now do stuff with it

2. The advantage of many chunks

If a chunk is small, its content is less likely to change. Therefore (1) it will remain cached for longer and (2) it will rarely change filename due to content change, and so rarely cause cascading changes in files which import it.

3. Why do people want manual control over chunks?

I suspect the main reason is something along these lines:

  1. Your app has 2 pages, home and about.
  2. You use React.
  3. You very rarely update React, but the rest of your app changes often.
  4. ESBuild cunningly recognises that only home uses React.useState and only about uses React.useEffect. It puts useState in the home.js chunk, and useEffect in the about.js chunk. Great!
  5. You update the code for home.
  6. The home.js chunk is changed and the browser needs to download it again. It also ends up downloading the code for React.useState again, even though it's not changed, because that's bundled in the home.js chunk.
  7. You are unhappy. You say "I wish I had manual control over the chunks so I could stop this madness!"

I contend: The user asking for manual control has correctly identified the problem, but not necessarily the best solution.

Rather than allowing the user to say "I want all of React in one chunk", you could allow them to say "I don't want React to be mixed into a chunk containing other non-React code".

The 1st gives you:

// home.js
import {useState} from './react.js';

// about.js
import {useEffect} from './react.js';

// react.js
export function useState() {}
export function useEffect() {}

home.js and about.js are both importing code they never use.

The 2nd would result in:

// home.js
import {useState} from './home_react.js';

// about.js
import {useEffect} from './about_react.js';

// home_react.js
export function useState() {}

// about_react.js
export function useEffect() {}

The two entry points only download the code they need, but home_react.js and about_react.js will not change unless React is updated, so they can be cached for longer.

This would actually be quite easy to implement without changing ESBuild's splitting algorithm. You just need to introduce a pseudo-entry point containing import * as React from 'react', and ESBuild will split the chunks as above.

I'm not convinced that most users really do want manual control. I think most would prefer the build tool to do everything for them if it can do it as well as they could.

4. Conclusions

ESBuild's algorithm very cleverly produces the optimum split chunks for a given set of code. It does not, however, optimize for:

  1. how the code is loaded and
  2. caching - reducing chunk churn as the codebase changes over time

(1) refers to my example in "The penalty of many chunks" above. Hoisting all imports to the entry chunks might require more bytes of output, but it might still load faster due to fewer round trips to the server.

The beginnings of a possible solution to (2) are discussed above. You could even take a more radical approach and aim to produce as many chunks as possible. That way, caching will be even more durable, at the cost of a huge number of chunks.

However, that does assume HTTP/2 and therefore that producing loads of chunks is not a problem. A really good splitting strategy would be adaptable for the circumstances of either HTTP/1 or HTTP/2.

Perhaps all cases could be covered with two user settings:

  1. List of modules unlikely to change often, in priority order
  2. Limit on total number of chunks / limit on number of chunks per entry point (e.g. no more than 5 imports for any entry point)

ESBuild would split the code into chunks based on these 2 constraints: start with an unlimited number of chunks, and combine them as required to hit the desired number, guided by the list of what is most important to split off.

Sorry that's really long. It's a very interesting area!

jlfwong commented 3 years ago

RE: small chunks, they are more cache-optimal, but they also require more total network bandwidth because compression works better on larger chunks thanks to de-duplication of content between chunks (e.g. len(gzip(a + b)) < len(gzip(a)) + len(gzip(b))). See e.g. https://blog.khanacademy.org/forgo-js-packaging-not-so-fast/
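
A quick way to sanity-check that claim against your own output (a sketch using Node's built-in zlib; the file paths are placeholders):

const { gzipSync } = require('zlib');
const { readFileSync } = require('fs');

const a = readFileSync('out/chunk-a.js');
const b = readFileSync('out/chunk-b.js');

const separate = gzipSync(a).length + gzipSync(b).length;
const combined = gzipSync(Buffer.concat([a, b])).length;

// Content shared between the two chunks compresses away when they are
// concatenated, so `combined` is usually smaller than `separate`.
console.log({ separate, combined });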

overlookmotel commented 3 years ago

@jlfwong Thanks. That's interesting. Well I did say my understanding was basic!

So a vast number of chunks is a bad idea, but still there's likely to be a "best middle place", which may well be more chunks than are currently generated. That point will differ from site to site depending on, for example, how much of their traffic is repeat visitors, and therefore how much caching comes into play.

I wonder if the two settings I proposed above would be enough to allow people to tune it for their needs, without getting into manual splitting with its downsides and overhead for the developer? Or is it too simplistic?

DanielHeath commented 3 years ago

"Create an unused entrypoint with import React from 'react'; window._react = React" is a method of manually adjusting the bundle splitting. Not a great one, admittedly, but I'm not clear there exists a good answer.

overlookmotel commented 3 years ago

@DanielHeath Just to be clear, what I was suggesting is that ESBuild provide an additional option (e.g. splitOn: ['react']) which would internally create these dummy entry points, but not output them.

But yes, for anyone who wants some form of manual chunk splitting right now, this is a workaround to do it.
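
Spelled out, the workaround looks something like this (a sketch; the extra entry point and its name are made up):

// react-pin.js - a dummy entry point whose only purpose is to reference all of
// React, so that React code ends up in shared chunks rather than being inlined
// into chunks that also contain app code. Its own output file is never loaded.
import * as React from 'react';
window.__reactPin = React;

It is then passed as an extra entry point alongside the real ones (for example entryPoints: ['src/home.js', 'src/about.js', 'src/react-pin.js'] with splitting: true), and its output file is simply never referenced from any HTML.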

ephemer commented 3 years ago

Hi there!

I'm just wondering if code splitting with IIFE is still interesting to @evanw and others. As much as we'd love to use esm for our browser output, we currently need to use IIFE due to naming conflicts with global variables in external library code. I assume our case is fairly common, because esbuild's default for target: "browser" is to use IIFE. I also assume that browser output would stand to gain the most benefit from code splitting compared to other environments.

tl;dr I'm wondering if there's a way to use esbuild differently to get code splitting working today (for example somehow working around the conflicts in the global namespace and using esm directly) and if not whether the iife format is still destined to receive the code splitting kiss of life at some point. I know @evanw has mentioned a couple of times that it's on the cards but the last time was in November and things can certainly change over time.

I do want to add that I understand this is just one of many features and bugs on the esbuild radar, so would totally understand any answer here or none at all. In any case, thanks so much for esbuild, it is an awesome piece of software engineering and such a breath of fresh air to the ecosystem.

jbms commented 3 years ago

The lack of support for IIFE is also particularly unfortunate because currently Firefox does not support esm for web workers, which means esbuild code splitting cannot be used for web workers.

jpike88 commented 3 years ago

My current bundler splits app-specific files into an app.js and all node_modules/external files into a vendor.js

Does anyone else do this and is there a real benefit in it? Feels weird to have esbuild squishing everything into one big js file.

retorquere commented 3 years ago

I would if I could. If anything, it isolates load failures to a smaller file to run diagnosis on. For the same reason I'd prefer to be able to split into vendor.js, common.js, and page-specific bundles. It leaves a greater chance that parts will load.

sachinahya commented 3 years ago

Does anyone else do this and is there a real benefit in it? Feels weird to have esbuild squishing everything into one big js file.

The main benefit comes when you include content hashes in the output filenames and configure your server with long term caching headers. Every time you build, any chunks that don't change will retain the same filename so that users who already have cached copies can continue to use those and only redownload the updated chunks. Generally vendor chunks are much larger and change less frequently than your app chunks.
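
For example, with hashed chunk file names the output directory can be served with immutable long-term caching (a minimal sketch using Express; the paths are assumptions):

const express = require('express');
const app = express();

// Chunk files embed a content hash, so they can be cached "forever": any
// change produces a new file name instead of invalidating the old one.
app.use('/assets', express.static('dist', { maxAge: '1y', immutable: true }));

app.listen(3000);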

jpike88 commented 3 years ago

@evanw any idea of the feasibility, difficulty, or timeline of being able to produce a vendor bundle? I'd love to help out but my Go ability is nonexistent

arcanis commented 3 years ago

I'm currently working to adapt our large codebase so that it compiles with esbuild, but I'm still unsure what the best path to production is, given the lack of IIFE bundle splitting. The current options I see are:

Which options have people picked so far and what were the results? @evanw is IIFE bundle splitting still on the roadmap, and is there anything a company could do to help (through sponsorship or external contributions, perhaps)?

matthiasg commented 3 years ago

@arcanis Isn't it possible to write an IIFE entry file around the ESM modules? Or do you want to target non-evergreen browsers, or the Firefox issue mentioned above?

ephemer commented 3 years ago

I would like to quickly chime in on this discussion again because it's been a few months since I last wrote. In July I put a bunch of work into this and was able to get code splitting working fine for esm. Unfortunately I don't remember right now exactly what needed changing for this to work in our setup – the main thing was probably changing the input files so they really did use es module syntax by cleanly importing and exporting the needed parts. We have a large legacy codebase written in Meteor and I put a bunch of work into removing the Meteor magic and using real imports.

With that in mind, I would now instead request that the default for target: "browser" be changed to bundle as esm, given that more and more browsers support it. I don't consider it that urgent or important though, and it would be a breaking change. Maybe something to consider later.

As for Firefox not supporting esm @jbms, in our setup we have a loader plugin that creates a separate bundle for our worker by calling the esbuild API again from the plugin, just to bundle the worker file. There we use a different set of settings (in our case we do not need splitting at all for our worker bundle). Maybe that workaround is possible for you too.
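
For reference, the general shape of such a plugin might look like this (a rough sketch of the approach described above, not the actual plugin; the ?worker import convention and output paths are made up):

const path = require('path');
const esbuild = require('esbuild');

const workerPlugin = {
  name: 'worker-bundle',
  setup(build) {
    // Redirect imports like "./my-worker.js?worker" into a custom namespace.
    build.onResolve({ filter: /\?worker$/ }, (args) => ({
      path: path.resolve(args.resolveDir, args.path.replace(/\?worker$/, '')),
      namespace: 'worker',
    }));
    // Bundle the worker file with its own (non-splitting, iife) settings by
    // calling the esbuild API again, then export the URL of the result.
    build.onLoad({ filter: /.*/, namespace: 'worker' }, (args) => {
      esbuild.buildSync({
        entryPoints: [args.path],
        bundle: true,
        format: 'iife',
        outfile: 'dist/worker.js',
      });
      return { contents: 'export default ' + JSON.stringify('/worker.js') + ';' };
    });
  },
};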

jbms commented 3 years ago

Thanks for the suggestion. When bundling and not using code splitting, is there any significant difference between esm and iife format? I do want splitting for worker bundles, so unfortunately your suggestion does not help me there.

retorquere commented 2 years ago

I'm in a similar bind -- I'm writing a Zotero plugin, and esm modules are not supported there.

stefanoverna commented 2 years ago

Hi, I'm curious to know if dynamic expressions in import()s are currently working, and if there's a plan to support them:

import(`./locale/${language}.json`).then((module) => {
  // do something with the translations
});

I've done some quick tests and it seems that the ./locale/${language}.json part is left unchanged, while with regular imports it correctly rewrites the path (e.g. to include the chunk hash).

Thanks for your beautiful work!

evanw commented 2 years ago

No, supporting these expressions by default is out of scope: https://esbuild.github.io/api/#non-analyzable-imports. You can either support them by switching on language and returning a statically-determined import() expression based on the value of that variable, or use another bundler such as Webpack that does this. In the future, it may be possible to handle code like this with a plugin, but that doesn't currently work.
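
The suggested workaround looks something like this (a sketch; the list of locales is hypothetical):

// Each branch contains a statically-analyzable path, so esbuild can bundle
// (and, with --splitting, split) every possible locale file ahead of time.
function loadLocale(language) {
  switch (language) {
    case 'de': return import('./locale/de.json');
    case 'fr': return import('./locale/fr.json');
    default:   return import('./locale/en.json');
  }
}

loadLocale(language).then((module) => {
  // do something with the translations
});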

pft commented 2 years ago

Code splitting with dynamic import() of a JSON file that has key names at the root that would not be valid JavaScript identifiers yields incorrect named exports:

File a.json:

{ "x-y": "foo" }

File imp.js:

const getJSON = () => import("./a.json");
getJSON();

Build an esm bundle with splitting:

[user@dom0 ~]$ esbuild --splitting --bundle --format=esm --outdir=app imp.js

Output:

File app/a-ZAVKVQOM.js:

// a.json
var x_y = "foo";
var a_default = { "x-y": x_y };
export {
  a_default as default,
  x_y as "x-y"
};

File app/imp.js:

// imp.js
var getJSON = () => import("./a-ZAVKVQOM.js");
getJSON();

I propose simply not trying to export those fields separately.

By the way, the JSON module proposal does not do named exports at all for JSON files, precisely because of this reason (and because it's conceptually a single thing); this reasoning is at the bottom of that page.

evanw commented 2 years ago

Code splitting with dynamic import() of a JSON file that has key names at the root that would not be valid JavaScript identifiers yields incorrect named exports:

// a.json
var x_y = "foo";
var a_default = { "x-y": x_y };
export {
  a_default as default,
  x_y as "x-y"
};

This is perfectly valid JavaScript. It uses a new JavaScript syntax feature called Arbitrary Module Namespace Identifiers. I can understand the confusion because this feature was somehow added even though it bypassed the TC39 proposal process, and was therefore not ever really announced despite being a significant addition to the language. But it has already been added to the ECMAScript specification and support for it has shipped in Chrome 90+, Firefox 87+, and node 16+. It's a real JavaScript language feature. As with all new JavaScript language features, you need to make sure to set esbuild's --target= setting appropriately to tell esbuild to not use syntax features that are newer than what your target environment supports. For example, if you pass --target=node14 the x-y export will not be generated.

I propose to simply not try and export those fields separately.

It's true that this is a bundler-specific extension, and not part of a standard. Node doesn't behave this way for example. But it's a useful extension because it lets you import specific fields from the JSON file without importing the whole thing. For example, you can import { version } from './package.json' and all fields except version will be tree-shaken away. With the Arbitrary Module Namespace Identifiers feature you can also import { 'x-y' as x_y } from './a.json' if you need to.

pft commented 2 years ago

Thanks for clarifying, @evanw, about this new spec and how to deal with it if the intended environment does not support it.

One question though: In dynamic imports, there is no syntax to import stuff like that, or am I missing something?