Chunks - Githubissues

jamiebuilds commented 6 years ago

There are generally a couple of different ways to run tools in a Bolt repo:

- Globally (across all workspaces at once)
- By workspace
  - Parallel
  - Chunks by dependency tree
  - Serial

When you have a tool that could either run globally or by workspace, you have to make a tradeoff decision:

- Does the tool have a long startup time? Running by workspace is probably slower, since it has to start up over and over.
- Does the tool have a long processing time? Running globally is probably slower, since it would benefit from parallelization.

Right now we don't offer anything in between these two extremes. The different "by workspace" modes don't help at all because they all spawn a new process for every workspace.

(./) $ tool "./packages/*"

(./packages/a) $ tool "."
(./packages/b) $ tool "."
(./packages/c) $ tool "."
(./packages/d) $ tool "."

But what if we could break workspaces into "chunks" and spawn a smaller number of processes?

(./) $ tool "./{packages/a,packages/b}"
(./) $ tool "./{packages/c,packages/d}"

This could allow us to increase parallelism of a tool while also reducing the number of processes that it starts up.
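
As a rough sketch of the idea above, the following splits a list of workspace directories into a small number of chunks and turns each chunk into a brace glob like the ones shown. The names `chunkWorkspaces` and `toGlob` are hypothetical and are not part of bolt or chunkd; the contiguous split is just one possible strategy.

```typescript
// Hypothetical helper (not a bolt API): split workspace directories into at
// most `count` contiguous chunks of roughly equal size.
function chunkWorkspaces(workspaceDirs: string[], count: number): string[][] {
  if (workspaceDirs.length === 0) return [];
  const chunkCount = Math.max(1, Math.min(Math.floor(count), workspaceDirs.length));
  const size = Math.ceil(workspaceDirs.length / chunkCount);
  const chunks: string[][] = [];
  for (let i = 0; i < workspaceDirs.length; i += size) {
    chunks.push(workspaceDirs.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical helper: build one glob per chunk. A single-item chunk gets a
// plain path, because a brace group with one item is not expanded by most
// glob implementations.
function toGlob(chunk: string[]): string {
  return chunk.length === 1 ? `./${chunk[0]}` : `./{${chunk.join(",")}}`;
}

const exampleGlobs = chunkWorkspaces(
  ["packages/a", "packages/b", "packages/c", "packages/d"],
  2
).map(toGlob);
console.log(exampleGlobs);
```

With four workspaces and two chunks this yields the two globs from the example above, so the tool starts twice instead of four times.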

CLI

Initially, I would focus only on the case where the --parallel flag is on.

bolt workspaces exec --parallel --chunks -- tool "()"

If you wanted to limit the number of items to place in a chunk at a time:

bolt workspaces exec --parallel --chunks --max-chunk-size 10 -- tool "()"
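
One possible reading of the proposed --max-chunk-size option, sketched below: no chunk ever holds more than the given number of workspaces, and the last chunk takes whatever is left. The function name `chunkByMaxSize` and the example package names are hypothetical; how bolt would actually size chunks is not settled in this thread.

```typescript
// Hypothetical helper (not a bolt API): split workspace directories into
// contiguous chunks that each hold at most `maxChunkSize` entries.
function chunkByMaxSize(workspaceDirs: string[], maxChunkSize: number): string[][] {
  if (!Number.isInteger(maxChunkSize) || maxChunkSize < 1) {
    throw new RangeError("maxChunkSize must be a positive integer");
  }
  const chunks: string[][] = [];
  for (let i = 0; i < workspaceDirs.length; i += maxChunkSize) {
    chunks.push(workspaceDirs.slice(i, i + maxChunkSize));
  }
  return chunks;
}

// 25 made-up workspaces with a limit of 10 per chunk.
const exampleDirs = Array.from({ length: 25 }, (_, i) => `packages/pkg-${i}`);
const exampleSizes = chunkByMaxSize(exampleDirs, 10).map((chunk) => chunk.length);
console.log(exampleSizes); // [10, 10, 5]
```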

jamiebuilds commented 6 years ago

I already have a tool for most of the logic here too: https://github.com/jamiebuilds/chunkd

jamiebuilds commented 6 years ago

Note: We're gonna need some syntax for passing the chunks into tools. We can't use env variables because the shell evaluates them before we see them.

lukebatchelor commented 6 years ago

What do you mean by "before we can see them"?

Also, am I right then in assuming this would require custom logic in each tool (or config of tool) to support it? Or do you have something else in mind?

jamiebuilds commented 6 years ago

bolt workspaces exec --parallel --chunks -- tool $CHUNK

$CHUNK is evaluated to "" outside of the bolt process, so we just see ""

jamiebuilds commented 6 years ago

> Also, am I right then in assuming this would require custom logic in each tool (or config of tool) to support it? Or do you have something else in mind?

Any tool that supports passing globs would support this. ESLint for example

lukebatchelor commented 6 years ago

> $CHUNK is evaluated to "" outside of the bolt process, so we just see ""

Ah yeah, I was thinking about tools with a config.js, which would mean bolt could set env vars when it spawns them.

jamiebuilds commented 6 years ago

GNU parallel uses {} and others: https://www.gnu.org/software/parallel/parallel_tutorial.html#Replacement-strings
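
A minimal sketch of how a GNU parallel style replacement string could work here, assuming bolt receives the command as an array of arguments after `--`. Unlike $CHUNK, a literal {} is passed through by bash unchanged, so the process can substitute it itself. The function `fillPlaceholder` and the choice of `{}` as the default are assumptions for illustration, not an existing bolt feature.

```typescript
// Hypothetical helper (not a bolt API): replace every occurrence of the
// placeholder in each argument with the glob for one chunk.
function fillPlaceholder(
  args: string[],
  chunkGlob: string,
  placeholder: string = "{}"
): string[] {
  // split/join replaces all occurrences without regex escaping concerns.
  return args.map((arg) => arg.split(placeholder).join(chunkGlob));
}

// e.g. the arguments after `--` in: bolt workspaces exec --parallel --chunks -- tool {}
const exampleArgs = ["tool", "{}"];
console.log(fillPlaceholder(exampleArgs, "./{packages/a,packages/b}"));
console.log(fillPlaceholder(exampleArgs, "./{packages/c,packages/d}"));
```

Each call produces the argument list for one spawned process, so any tool that accepts globs would need no changes of its own.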

boltpkg / bolt

Chunks #199
