webpack / webpack

A bundler for javascript and friends. Packs many modules into a few bundled assets. Code Splitting allows for loading parts of the application on demand. Through "loaders", modules can be CommonJs, AMD, ES6 modules, CSS, Images, JSON, Coffeescript, LESS, ... and your custom stuff.
https://webpack.js.org
MIT License
64.67k stars 8.8k forks source link

Treeshaking with side effect free package is unexpectedly slow #15643

Open archfz opened 2 years ago

archfz commented 2 years ago

Bug report

What is the current behavior?

Given that one uses package that has multiple ESM exports, and marked as side effect free, all sub-trees seems to be analyzed or even compiled, which causes great performance loss. Inclusion (or not inclusion) into final bundle works correctly.

If the current behavior is a bug, please provide the steps to reproduce.

I've created this setup to demonstrate https://github.com/archfz/webpack-treeshake-issue

What is the expected behavior?

I am not sure if this is a bug or this is intended to work like this. I would expect the subtree that is not imported to not be analyzed and/or compiled at all. Given the example repository I would expect the same webpack compile duration using either some-module or some-module-less. Instead there is a big performance difference.

Given this performance I do not see the use of tree-shaking versus used-exports optimization. Indeed with sideeffectless ESM it is quite easier and faster to determine when to not include unused exports, than with used-exports optimization, but the big gains is the latter where the whole sub-tree is completely skipped.

Other relevant information: webpack version: 5.72.0 Node.js version: 16.x Operating System: Ubuntu 22 Additional tools: N/A

vankop commented 2 years ago

yeah, thats how current webpack version works.. tree-shaking happens after building a tree.

Also this will not work great with incremental builds and cache. For incremental build in your case all sub tree should be build since all subtree will be invalidated in cache ( if e.g. on watch you will change imports ). Right now only affected module will be invalidated. ( Since we store all modules tree and "side effects free" does not affect module tree )


Maybe this could be improved in the next major version

archfz commented 2 years ago

Yeah this would then be a real nice feature.

Question still remains: is there only treebuilding happening? or is there also compilation upon the subtree? or something else that is expensive? Because I don't understand why it's 12x slower. Parsing some imports in couple of files shouldn't take that long.

In case of incremental builds I am not sure I follow. Why does the whole subtree need to be invalidated in cache for that file that is changed? As I see upon change only imports need to be checked what was added and what was removed. If treeshaken module was imported on change only then load that subtree and add to cache. Wouldn't this work similarly to how new imports are added from new unseen files for webpack?

vankop commented 2 years ago

is there only treebuilding happening?

all modules are parsed.

or is there also compilation upon the subtree?

not sure what you mean by compilation.. module graph is optimizing after exports analyze ( side effects apply here ). no output generated for "tree-shaken" modules

Why does the whole subtree need to be invalidated in cache for that file that is changed?

Not modules itself by exports that this modules provide. e.g. in case A module import B module and B imports unused C module (A -> B -> C and C is unused and side effect free) So on first build A and partial B module exports is known. If we change module A the way that C module is used, this will trigger exports invalidation for module B + parsing/analyze module C subtree. Things become a lot more complex when one module is imported in different modules ( common scenario )

Maybe this could be optimized, but definitely things are more complex comparing to how right now this works ( building tree, then optimize )

archfz commented 2 years ago

Ok I am no expert in this. Hope it's possible tho.

By compilation I meant running the files through loaders.

vankop commented 2 years ago

By compilation I meant running the files through loaders.

yes, this is a part of creating module graph.

archfz commented 2 years ago

yes, this is a part of creating module graph.

Just to make sure I understand correctly. In the above example repo babel will be ran on each file loaded from some-module and its dependencies?

Why is this needed to build the module graph? Ofc talking here about third party packages with precompiled JS. Which is almost always the case with packages.

vankop commented 2 years ago

In the above example repo babel will be ran on each file loaded from some-module and its dependencies?

yes

Why is this needed to build the module graph?

because loaders transforms module code and we don't know how resulted code will looks like.

archfz commented 2 years ago

I will do some more fiddleing to see the impact of the discovery vs loaders running on graph building.

But my initial thought is that for graph building needing loaders to run through the code is an edge case, and maybe should be able to opt out. As what is and what isn't imported in most cases for sideffectfree js won't be changed by loaders. But maybe I am in the wrong here.

archfz commented 2 years ago

I've made another branch https://github.com/archfz/webpack-treeshake-issue/tree/no_babel_on_modules and here I transformed all jsx in the 3rd party packages to js. I've also made babel not parse files from node modules.

With this change it's confirmed that the loaders are also run on module graph building. Now the slowdown is down from 12x to only 3.5x. Which is a big improvement, but still not quite there. This would then essentially mean that module graph building of unimported side effect free modules still add a lot of wasted resource usage.

My argument still stands. I don't see the point, in 99% of cases, running the loaders on js files prior to module graph building. Babel here is a very good example, as most probably all projects configure this loader to compile all files, even ones loaded from third party packages, due to the fact that one cannot know for sure what that package targets as ecmascript version.

Would it be easily possible to introduce an optout flag for loaders for module graph building? Or are there any counter arguments?

webpack-bot commented 2 years ago

This issue had no activity for at least three months.

It's subject to automatic issue closing if there is no activity in the next 15 days.

archfz commented 2 years ago

Maybe loaders should mark themselves sideeffect free, or by default be and can opt out.

webpack-bot commented 2 years ago

Issue was closed because of inactivity.

If you think this is still a valid issue, please file a new issue with additional information.

TheLarkInn commented 1 year ago

So idea is: