ljharb / js-traverse

MIT License
45 stars 8 forks

Optimize bundle size #12

Closed kibertoad closed 1 month ago

kibertoad commented 1 month ago

Recently there was a huge bump in total bundle size due to the introduction of new dependencies in traverse, see https://x.com/himseIf_65/status/1807923912493486182/photo/1

Would you be open to PRs that preserve current functionality, but cut down on the bundle size by either removing, or replacing some of the dependencies that were introduced?

ljharb commented 1 month ago

Always, but do note the current functionality includes engine support and robustness (which always applies to modern engines).

kibertoad commented 1 month ago

@ljharb Based on https://github.com/ljharb/js-traverse/blob/main/.github/workflows/node-tens.yml, it seems that supporting Node 10 as a minimum should be OK. Can you confirm that?

ljharb commented 1 month ago

https://github.com/ljharb/js-traverse/blob/main/.github/workflows/node-aught.yml

kibertoad commented 1 month ago

@ljharb Would you consider releasing a semver major that drops obsolete Node versions, for those who are willing to upgrade for the sake of a more reasonable bundle size? Based on the latest (unfortunately, overly heated at times) discussion around "supporting Node 0.4 forever", it is very clear where the ecosystem consensus is.

ljharb commented 1 month ago

Major releases contribute to much higher bundle sizes, via duplication.

kibertoad commented 1 month ago

@ljharb I assume what you are saying is that you are against removing support for older Node.js under any circumstances? And that any removal of excessive dependencies needs to be 100% compatible with Node 0.4?

ljharb commented 1 month ago

I’m saying that the benefits of breaking changes rarely outweigh the ecosystem cost they impose.

dependencies are not excessive by definition, unless they can be removed without breaking changes - and if they can be, then it’s perfectly fine to do so.

kibertoad commented 1 month ago

I don't want to get into an argument that will go nowhere and will only fuel future flame wars and conspiracy theories.

So, just so the rules of the game are clear: 1) support for versions as old as Node 0.4 stays no matter what; 2) dependencies can be removed as long as they can be replaced with a native solution, or with a smaller library with fewer dependencies, and as long as doing so does not contradict 1.

Is that an accurate summary? Am I missing something?

ljharb commented 1 month ago

I’m not saying “no matter what”, but absent sufficient value, I’m not going to make a breaking change.

Any change that is non-breaking is always worth considering.

kibertoad commented 1 month ago

absent sufficient value, I’m not going to make a breaking change.

What would constitute sufficient value?

ljharb commented 1 month ago

I don’t have objective criteria in mind, but “bundle size” (when bundlers can change whatever they want without requiring any changes in the ecosystem) isn’t it.

kibertoad commented 1 month ago

when bundlers can change whatever they want without requiring any changes in the ecosystem

Can you elaborate on that? If an app depends on library A, and library A depends on library B, there is nothing the bundler can do about it; both library A and library B go into the bundle.

ljharb commented 1 month ago

Of course there is. There’s npm overrides, and the bundler can rewrite literally any part of the application since it’s producing the bundle. Airbnb was doing this with polyfills (ones it’d already applied in a separately optimized bundle) way back in 2016; it’s not a new technique or capability.
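(For readers unfamiliar with the mechanism ljharb mentions: npm's overrides field lets an application force every copy of a transitive dependency, at any depth, to resolve to a different version or an aliased replacement. The package name and alias below are purely illustrative; this is a sketch of the syntax, not a recommendation for traverse specifically.)

```json
{
  "overrides": {
    "string.prototype.split": "npm:my-split-stub@1.0.0"
  }
}
```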

kibertoad commented 1 month ago

@ljharb Wouldn't doing that basically require forking traverse and producing a version with fewer dependencies? Otherwise you can't just remove dependencies and expect things to work. And I'm not sure how that is less disruptive for end users than releasing a semver major.

ljharb commented 1 month ago

Sure, that’s one alternative. You can also do rewrites in place, e.g. replacing var split = require('string.prototype.split'); with var split = Function.call.bind(String.prototype.split);

It’s less disruptive because the default is safe, and most applications and websites simply don’t have meaningful size constraints they need to worry about.

ottobunge commented 1 month ago

@ljharb My issue with that recommendation is that it requires a lot of steps from users, who may not always be experienced enough to know all the proper techniques for reducing bundle size.

Extra dependencies that don't meaningfully contribute to the project are unjustified and, I think, should be treated as harmful: this package went from being self-contained to a spider web of dependencies.

What happens when the dependencies change their APIs for some reason, stop being actively maintained, or, worse, become the vector for a supply-chain attack?

And if all this is done for the sake of backwards compatibility: why? Who was asking for it? Where were the issues?

ljharb commented 1 month ago

@ottobunge it's totally true that it requires more effort, but most users don't need to worry about bundle sizes, since most applications simply aren't that performance-sensitive.

"dependency count" simply isn't an objectively bad number to have increase; that's just left-pad style small package FUD. If something malicious happens with any dependency, you simply replace it.

akx commented 1 month ago

Bundle size isn't the only issue. Where the previous version of traverse was a single download, the current one is in the worst case up to about 50 (if I'm visually grepping the new dependency graph correctly).

Fetching those dependencies has an ecological cost too - I don't think we should be burning cycles, bandwidth, and capacity unnecessarily.

ljharb commented 1 month ago

This isn't a debate worth having yet again. The issue was asking about PRs to reduce dependencies; I've explained the constraints.

vorant94 commented 1 month ago

Sure, that’s one alternative. You can also do rewrites in place, e.g. replacing var split = require('string.prototype.split'); with var split = Function.call.bind(String.prototype.split);

Wow, this is something new! So instead of the library shipping a minimal amount of custom code and relying on runtime built-ins (effectively allowing the user, IF necessary, to polyfill the code with a bundler), you are up for the reverse? Ship the maximum amount of custom code to cover the oldest runtime versions, and IF the user DOESN'T need it, the user should strip it out with a bundler? I don't even know the correct term for it; is it a reverse polyfill?

And all of this is in the context of the server, where the user fully controls the runtime version. The whole polyfill approach came up at least partially because devs wanted to use new features but were forced to support uncontrolled, potentially legacy client runtimes, so we came up with bundlers.

Bundlers are meant to be used for client code, not for server code. They are meant to increase code compatibility, not to clean up redundant shipped-in polyfills.

MeLlamoPablo commented 1 month ago

most users don't need to worry about bundle sizes, since most applications simply aren't that performance-sensitive.

@ljharb this is a very bold assumption to make. I don't have data to back this but I'd wager that if a survey was conducted the % of developers concerned with this would be significant.

On the other hand I'd also guess that the % of developers requiring Node 0.4 support would be orders of magnitude lower.

ljharb commented 1 month ago

@MeLlamoPablo "node 0.4" is a strawman; even if I required node 22, I'd need almost all the same dependencies, because most of them are about robustness, which applies forever.

As for surveys, when I see one whose demographics aren't massively skewed from empirical real-world usage, then I'll make decisions solely based on them :-)

vorant94 commented 1 month ago

"dependency count" simply isn't an objectively bad number to have increase;

Yes, it is. Even if it may not affect the experience of the end user (the person who uses the software), it still impacts the lib user (the person who writes software):

While the whole ecosystem moves in one direction, with more and more functionality built into runtimes, you deliberately keep the ecosystem dependent on your third-party packages, even for those users who are on the latest runtimes. At this point in time, I consider this actively harmful to JS in general. "Just fork it" won't solve the problem for me, because I don't depend on your packages directly; I depend on packages that depend on your packages (good luck forking eslint and all the affected plugins, for example).

[^1]: Most of those warnings usually aren't about the project's own deps, but about deps of deps, so developers just ignore them until there is eventually an update of the actual project dep. This leads to increased ignorance of those monitoring-system warnings in general, which increases the chance that one real warning about a project dep goes unnoticed and unaddressed because it was surrounded by 10+ other nothing-to-do-about warnings from deps of deps.

ljharb commented 1 month ago

Surface area of a supply chain attack comes from the number of publishers in your graph, not the number of dependencies. You'll find that the vast majority of the dependencies everyone's so concerned about have a single publisher.

npm audit and dependabot warnings are good - they indicate a potential security issue - and fewer deps actually increases the likelihood of an undiscovered security issue. Code reuse improves security.

For serverless environments, you'd probably want to bundle anyways, because that gives you the fastest cold start time and the smallest disk size.

Increased dep count doesn't necessarily increase bundle size, and in fact, often decreases it - instead of the code being inlined in multiple places, it's deduplicated to one place.

A developer's hard drive is not a resource-constrained environment; storage space is cheap, and both pnpm and yarn have solutions that minimize the disk space impact.

At this point of time I consider you being actively harmful to JS in general.

Personal attacks are violations of our CoC, destroy your credibility, and weaken your argument. I suggest you avoid making them.

vorant94 commented 1 month ago

Surface area of a supply chain attack comes from the number of publishers in your graph, not the number of dependencies. You'll find that the vast majority of the dependencies everyone's so concerned about have a single publisher.

Having 1 lib as a dependency means constantly having 1 publisher as a dependency. Having 10 libs from that same 1 publisher means a chance of the number of publishers increasing at any given point in time.

npm audit and dependabot warnings are good - they indicate a potential security issue - and fewer deps actually increases the likelihood of an undiscovered security issue. Code reuse improves security.

npm audit and dependabot warnings are good unless they become white noise that is ignored altogether; that is how human psychology works.

For serverless environments, you'd probably want to bundle anyways, because that gives you the fastest cold start time and the smallest disk size.

Making bundling for serverless environments more of a "must" (by potentially reaching the hard size limit) rather than a "should" is bad developer experience, and obviously a deal-breaker for less experienced developers. Compare "it just works" on the latest runtime version with the lowest number of deps, versus "bundle, minify, reverse-polyfill your code" even on the latest runtime version with an increased number of deps.

Increased dep count doesn't necessarily increase bundle size, and in fact, often decreases it - instead of the code being inlined in multiple places, it's deduplicated to one place.

Simply not the case, because the alternative to var split = require('string.prototype.split'); isn't var split = Function.call.bind(String.prototype.split);. The alternative is not to have either of them at all and to just use "foo".split();

A developer's hard drive is not a resource-constrained environment; storage space is cheap, and both pnpm and yarn have solutions that minimize the disk space impact.

It's not up to the library author to decide that, but to the library consumer: the developer. A developer who is using the latest runtime version may consider support for older ones obsolete as well, but it's not up to them to decide; it's up to the library author. Let's not cross those lines.

pnpm and yarn are solving a problem that, in this case, could have been avoided in the first place. Also, the lib author can't control what the consumer of the lib is using; it could be any package manager, or even another Node-compatible runtime.

Not only is the burden of bundling server-side code on the consumer, but so is the burden of optimizing their machine's drive, when it could have been avoided from the beginning.

Personal attacks are violations of our CoC, destroy your credibility, and weaken your argument. I suggest you avoid making them.

I didn't say a word about you as a person, only about the work you are doing for the ecosystem. I don't consider that personal. Maybe the wording was misleading, but the point remains.

kibertoad commented 1 month ago

@ljharb let's lock this issue, I don't think it will bring any additional valuable discussion at this point.