microsoft / monosize

Bundle size tooling for monorepos
MIT License

Perf: parallelizing work? #74

Open benkeen opened 3 weeks ago

benkeen commented 3 weeks ago

Hi! We've added monosize to one of our component packages and have around 200 separate fixtures set up. Perf-wise, this takes ~27 minutes on the CI, which is a pretty big chunk of time. Are there any options to parallelize the work that needs to be done? Something like Jest sharding, e.g. monosize measure --shard=1/4, would work.
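
For illustration, the sharding idea can be sketched in TypeScript. This is a minimal sketch, not a monosize feature: `--shard` is only the flag proposed above, and `parseShard` and `selectShard` are hypothetical helpers that split a fixture list deterministically so that each CI job measures a disjoint subset.

```typescript
// Hypothetical helpers: monosize does not ship a --shard option.

// Parses a value such as "1/4" into a 1-based shard index and a shard count.
function parseShard(value: string): { index: number; count: number } {
  const match = /^(\d+)\/(\d+)$/.exec(value);
  if (!match) {
    throw new Error(`Invalid shard "${value}", expected the form "index/count"`);
  }
  const index = Number(match[1]);
  const count = Number(match[2]);
  if (count < 1 || index < 1 || index > count) {
    throw new Error(`Shard index must be between 1 and the shard count, got "${value}"`);
  }
  return { index, count };
}

// Returns the fixtures that belong to the given shard.
// Sorting first makes the split identical on every CI job,
// regardless of the order in which the file system lists the files.
function selectShard(fixtures: string[], index: number, count: number): string[] {
  const sorted = [...fixtures].sort();
  return sorted.filter((_, position) => position % count === index - 1);
}
```

Each CI job would then pass its own shard to the selection step, and the per-shard reports would still need to be merged before comparison, which is not covered here.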

And related to this, I don't see an option with measure to just run it on a single file / globbing pattern. That would be useful sometimes.

benkeen commented 3 weeks ago

Also, a question about how this application works. I assumed it would do a single pass of webpack, use an entry point for each of the fixture files, and configure it so that each chunk contains all of its deps, thus ensuring the size measurement is accurate. But it looks like it runs a separate webpack build for every fixture?

layershifter commented 3 weeks ago

> Perf-wise, this takes ~27 minutes on the CI, which is a pretty big chunk of time. Are there any options to parallelize the work that needs to be done? Something like Jest sharding, e.g. monosize measure --shard=1/4, would work.
>
> And related to this, I don't see an option with measure to just run it on a single file / globbing pattern. That would be useful sometimes.

Implementing sharding is out of scope for monosize, but adding the ability to use a globbing pattern sounds reasonable to me. Implementation-wise, the relevant code is here:

https://github.com/microsoft/monosize/blob/351ae8e05fa2c01876120ff151f0255b50362c91/packages/monosize/src/commands/measure.mts#L42-L45

I would love to have this feature :)
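
A rough sketch of what filtering fixtures by pattern could look like, in TypeScript. This is not the code at the link above and not monosize's implementation: `globToRegExp` and `filterFixtures` are hypothetical names, the fixture paths in the usage note are made up, and a real implementation would more likely rely on an existing glob library.

```typescript
// Hypothetical helpers for illustration only.

// Converts a simple glob into a RegExp. Supported syntax (an assumption of this sketch):
//   *   any run of characters except "/"
//   **  any run of characters, including "/"
//   ?   a single character except "/"
function globToRegExp(pattern: string): RegExp {
  let source = '';
  for (let i = 0; i < pattern.length; i++) {
    const char = pattern.charAt(i);
    if (char === '*') {
      if (pattern.charAt(i + 1) === '*') {
        source += '.*';
        i++; // consume the second "*"
      } else {
        source += '[^/]*';
      }
    } else if (char === '?') {
      source += '[^/]';
    } else {
      // Escape characters that have a special meaning in a RegExp.
      source += char.replace(/[.+^${}()|[\]\\]/, '\\$&');
    }
  }
  return new RegExp(`^${source}$`);
}

// Keeps only the fixture paths that match the glob.
function filterFixtures(fixturePaths: string[], pattern: string): string[] {
  const matcher = globToRegExp(pattern);
  return fixturePaths.filter((fixturePath) => matcher.test(fixturePath));
}
```

With this sketch, `filterFixtures(paths, 'bundle-size/Button*.fixture.js')` would keep `bundle-size/Button.fixture.js` and `bundle-size/ButtonGroup.fixture.js` but drop `bundle-size/Menu.fixture.js`.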


> Also, a question about how this application works. I assumed it would do a single pass of webpack, use an entry point for each of the fixture files, and configure it so that each chunk contains all of its deps, thus ensuring the size measurement is accurate. But it looks like it runs a separate webpack build for every fixture?

Yes, it runs a separate build to ensure that module/chunk graphs are not shared between fixtures. You can switch to monosize-bundler-esbuild or write a custom adapter that satisfies your requirements.

> this takes ~27 minutes on the CI

At the same time, this worries me a bit. monosize is designed to be run on artifacts, i.e. the build results that will be shipped to npm. This ensures that the results match what consumers will get, and it avoids additional loader/plugin configuration, which keeps the Webpack build fast. How complex is your Webpack config?

benkeen commented 3 weeks ago

> I would love to have this feature :)

I'll see what I can rustle up. :)

> Yes, it runs a separate build to ensure that module/chunk graphs are not shared between fixtures.

Ah, I see. It's funny, I've spent so much time trying to keep chunk sizes small that this is really an opposite use-case: we want to tell webpack to generate chunks without sharing any code. Though I do wonder if without a clear hierarchy of when code will execute, Webpack would have to duplicate content in every chunk? I could be wrong, but I think in order to know what code can be eliminated in a chunk it needs to know what other code must have been loaded first. A flat array of entry points (the fixture files) shouldn't be enough info for Webpack to do anything clever so each chunk would contain everything it needed with everything duplicated, no? When I get a bit of time I'll play around with it.

> How complex is your Webpack config?

Quite. A co-worker just raised this as a possible cause of the slowness as well. Next week I'll benchmark a few fixtures that build just fine with the out-of-the-box webpack config to see how much slower they become with ours.

Thanks!

swese44 commented 3 weeks ago

> Though I do wonder if without a clear hierarchy of when code will execute, Webpack would have to duplicate content in every chunk? I could be wrong, but I think in order to know what code can be eliminated in a chunk it needs to know what other code must have been loaded first.

@benkeen we don't want any of those optimizations in the monosize webpack config, we want every component's measurement to be its "full cost of doing business" and include all dependencies it pulls in without splitchunks magic etc. The goal here is to help engineers understand how "heavy" their component is in isolation, understand how their code changes affect a component's weight, and provide a mechanism to catch feature-level bundle regressions before merging a PR. We should only add custom webpack config if there's a build issue which forces us to add a workaround.

If you separately want to explore a single shared webpack build (still per project in the monorepo) to improve build times: as long as each component is defined as an entrypoint in the generated webpack config and not as a lazy import, webpack's default behavior should be to duplicate all dependencies across all entrypoints (which is what we would want).
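
A minimal sketch of what such a generated config could look like, in TypeScript. It is not monosize's actual config: `createMultiEntryConfig` and the `MultiEntryConfig` type are hypothetical, and only the webpack options `mode`, `entry`, `output` and `optimization.splitChunks` / `optimization.runtimeChunk` are real. Whether the resulting sizes match those of separate per-fixture builds would still need to be verified.

```typescript
// Subset of webpack's configuration shape used by this sketch.
interface MultiEntryConfig {
  mode: string;
  entry: Record<string, string>;
  output: { path: string; filename: string };
  optimization: { splitChunks: boolean; runtimeChunk: boolean };
}

// Hypothetical helper: builds one config with one entrypoint per fixture.
function createMultiEntryConfig(fixturePaths: string[], outputDir: string): MultiEntryConfig {
  const entry: Record<string, string> = {};

  for (const fixturePath of fixturePaths) {
    // Entry name: file name without the directory and the last extension.
    const fileName = fixturePath.split('/').pop() ?? fixturePath;
    const name = fileName.replace(/\.[^.]+$/, '');
    if (Object.prototype.hasOwnProperty.call(entry, name)) {
      throw new Error(`Duplicate entry name "${name}"`);
    }
    entry[name] = fixturePath;
  }

  return {
    mode: 'production',
    entry,
    // One output file per entrypoint, e.g. "Button.fixture.js".
    output: { path: outputDir, filename: '[name].js' },
    optimization: {
      // Never extract shared modules into common chunks,
      // so each entry bundle keeps its own copy of its dependencies.
      splitChunks: false,
      // Keep the webpack runtime inside each entry bundle.
      runtimeChunk: false,
    },
  };
}
```

The returned object could be passed to webpack in place of one config per fixture, and the size of each `[name].js` output would then be read per entrypoint.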

benkeen commented 3 weeks ago

> If you separately want to explore a single shared webpack build (still per project in the monorepo) to improve build times: as long as each component is defined as an entrypoint in the generated webpack config and not as a lazy import, webpack's default behavior should be to duplicate all dependencies across all entrypoints (which is what we would want).

Exactly. That's what I figured webpack would do. Let me look into that and try it out, thanks!

benkeen commented 3 weeks ago

PR here to run monosize measure on specific fixtures by pattern/filename.