casey / just

🤖 Just a command runner
https://just.systems
Creative Commons Zero v1.0 Universal
21.71k stars 484 forks

Is there a way to run independent tasks in parallel #626

Open theHamsta opened 4 years ago

theHamsta commented 4 years ago

I wanted to ask whether we could extend just so that it runs independent dependencies in parallel, similar to make. Or is this feature hidden somewhere?

casey commented 4 years ago

It isn't possible at the moment, but I think this would be a cool feature. I think the best way to implement it would be to add annotations, and then define an annotation that makes a recipe run in the background.

roblav96 commented 4 years ago

@casey I'm absolutely in love with using a justfile, coming from a package.json npm scripts background.

Almost every npm script I write uses npm-run-all --parallel.

Keep up the great work, friend! Cheers 🍻

casey commented 4 years ago

Thanks Robert, I appreciate the kind words!

I'm glad to learn about npm-run-all --parallel, and agree that this would be a very worthwhile feature.

I think that there are a few features that are languishing, awaiting annotations. I've been dragging my heels on adding annotations, but since there are a bunch of worthy features that need them, hopefully I'll get around to it sooner rather than later.

roblav96 commented 4 years ago

No worries, nothing but time around here. lol

I ended up replicating roughly the same workflow in combination with Nukesor/pueue 😂

casey commented 4 years ago

Definitely sub-optimal compared to having it built into Just, but glad you found something that works! I'll have to check out pueue, it looks dope.

theHamsta commented 4 years ago

My solution is to have a just command that's invoking make that's invoking just. :smile: It works. Dunno whether it should work.

casey commented 4 years ago

My solution is to have a just command that's invoking make that's invoking just. 😄 It works. Dunno whether it should work.

That sounds like a highly reasonable solution :)

casey commented 4 years ago

I think I misunderstood the motivation behind this feature, and thus how it might be implemented.

Is the desire to run some recipes in a justfile in parallel, or all recipes in parallel? Let's call the former selective parallelism, and the latter universal parallelism.

I was thinking that people wanted selective parallelism, so perhaps you could annotate a recipe to say that its dependencies should run in parallel. So, for example, to run a, b, and c in parallel when running foo:

#[parallel]
foo: a b c

But I think people actually want universal parallelism: they want to run all recipes in a justfile in parallel, or at least to be able to pass a flag so that all recipes run in parallel.

I think this latter behavior is probably more useful, since it requires fewer annotations and thought on the part of users, and since parallelism could be selectively limited through the use of dependency constraints.

Currently, Just runs the dependencies of a recipe in order (barring dependencies between those dependencies), which users might have come to rely on.

There are a few possibilities:

  1. A cli flag that lets you run recipes in parallel, like --parallel. This has the downside that if a justfile relies on dependency ordering, it will break when run with the flag.

  2. A setting that says "this justfile expects its recipes to run in parallel", maybe parallel := true, that enables universal parallelism.
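Option 2 might look something like this hypothetical justfile (the `parallel` setting is only a sketch of the proposal, not an implemented feature):

```just
# Hypothetical setting: opt the whole justfile into universal parallelism.
set parallel := true

# With the setting above, a, b, and c could run concurrently, while
# explicit dependencies between recipes would still constrain ordering.
foo: a b c
```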

Does this seem like a good summary of what people want?

mortoray commented 4 years ago

For my use case, from #676 I want selective parallelism, and I'd prefer to have this specified in a target. Command line parameters remove some of the usefulness of "just" remembering what I want.

casey commented 4 years ago

For my use case, from #676 I want selective parallelism, and I'd prefer to have this specified in a target. Command line parameters remove some of the usefulness of "just" remembering what I want.

Can you elaborate on why universal parallelism is undesirable, or why it would break the justfile?

mortoray commented 4 years ago

Can you elaborate on why universal parallelism is undesirable, or why it would break the justfile?

I collect many different types of tasks in my justfile, these have different running requirements.

That said, I guess it depends on how the parallelism is specified. My sequential tasks don't use dependencies, they instead do recursive invocation of "just".

Thinking about that, perhaps a --parallel flag would be okay for my case, since I would have a target that invokes "just" with that flag, specifying the targets I want.

That is, I think the details of how this is implemented will decide whether it's an issue or not.

casey commented 4 years ago

The question that I'm most curious about is whether or not people are depending on the fact that dependencies of a recipe without interdependencies run in order. I'm guessing that they don't, and most people use explicit dependencies to order recipes that cannot run in parallel.

The reason I'm interested in that is because if people don't rely on this implicit dependency ordering, then command line flags or config options make a lot more sense, since justfiles wouldn't be likely to break if they suddenly ran in parallel.

I definitely agree that command-line flags are less convenient. I think a command-line flag would be good to start with, just as a simple way to prototype the feature.

roblav96 commented 4 years ago

Problem

Concurrently run multiple commands in parallel via a single command definition.

Example

Using npm-run-all --parallel in my package.json

"scripts": {
    "watch": "del dist; npm-run-all --silent --parallel watch:*",
    "watch:nodemon": "wait-for-change dist/index.js && delay 0.1 && nodemon dist/index.js",
    "watch:tsc": "tsc --watch --preserveWatchOutput",
},

jrop commented 3 years ago

For me, parallelized tasks would help speed up my build. Here is a simple case, but in my real-world use-case, I have around 15 modules, forming a complex dependency graph:

a:
  #!/usr/bin/env bash
  cd a
  ./build.sh
b:
  #!/usr/bin/env bash
  cd b
  ./build.sh
c: a b
  #!/usr/bin/env bash
  cd c
  ./build.sh

In this case, I want the build to happen like:

a     b
|     |
 \   /
   v
   c

If a and b build at the same time, that will speed up the build.
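Until just supports this natively, the diamond above can be approximated with plain shell job control. A minimal sketch, with the per-module build.sh calls stubbed out as shell functions so it is self-contained:

```shell
#!/bin/sh
# Stand-ins for `cd a && ./build.sh` etc., so the sketch is self-contained.
build_a() { echo "built a"; }
build_b() { echo "built b"; }
build_c() { echo "built c"; }

# Run the independent builds a and b in parallel, wait for both
# to succeed, and only then build c, which depends on them.
build_a & pid_a=$!
build_b & pid_b=$!
wait "$pid_a" && wait "$pid_b" && build_c
```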

mbodmer commented 3 years ago

I agree with @jrop's usecase, but I also have e.g. an all recipe, which depends on configure, build, packaging, deploy recipes. Here I need the sequence in order. But when I deploy to multiple hosts, the deploy recipe for each host could run in parallel.

hartmannr76 commented 3 years ago

Just a thought, wouldn't something like Make's -j flag fit for this? https://www.gnu.org/software/make/manual/make.html#Parallel

Seems to be the way to support it that would be consistent with the idea behind the project

jrop commented 3 years ago

The trick seems to be that sometimes, some will want the tasks to run in series, and sometimes in parallel. It seems to me that some extra syntax would need to be defined for dependencies. Say:

a:
  ..
b:
  ..

c: a > b # where `>` means series
# or
c: a | b # where `|` means parallel
# and if there were "groupings":
c: (a > b) | u | v

I'm not proposing this as the final syntax, but something like this would be useful in a task runner.
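For comparison, the proposed grouping already has a direct shell analogue, assuming a, b, u, and v are placeholder commands:

```shell
#!/bin/sh
# Placeholder recipes.
a() { echo "a"; }
b() { echo "b"; }
u() { echo "u"; }
v() { echo "v"; }

# Analogue of `c: (a > b) | u | v`: the subshell runs a then b in
# series, while u and v run in parallel alongside it.
( a && b ) &
u &
v &
wait   # block until all three branches have finished
```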

madig commented 3 years ago

I was thinking that people wanted selective parallelism, so perhaps you could annotate a recipe to say that its dependencies should run in parallel. So, for example, to run a, b, and c in parallel when running foo:

This is something I'd like to see. I have four recipes that use the same sources but do slightly different things and are independent from one another, so something like foo: a b c d launching the tasks in parallel would be nice.

saskenuba commented 2 years ago

Of course, it is not an ideal solution, but it works fine for tasks that don't end until manually terminated, such as spinning up servers.

In my justfile below, running default opens another terminal with a task that doesn't end and, in parallel, runs my development server. Perhaps this helps someone :smile:

set dotenv-load

default: run-meilisearch watch-jq

run-meilisearch:
    setsid alacritty --working-directory=. -e docker run -it -p 7700:7700 -e "MEILI_MASTER_KEY=$MEILISEARCH_MASTER_KEY" -v data-ms:/app/.data-ms  getmeili/meilisearch:v0.26.0rc0 &

watch:
    ~/.cargo/bin/systemfd --no-pid -s http::5001 -- cargo watch -x run -q | jq

watch-jq:
    @echo Waiting 5 seconds to ensure meilisearch starts
    sleep 5 && setsid alacritty --working-directory=. -e just watch &

release-jq:
~/.cargo/bin/systemfd --no-pid -s http::5001 -- cargo run --release | jq

runeimp commented 2 years ago

Couldn't you just do...

a:
    # Long running process

b:
    # Long running process

parallel-sh:
    just a & # runs in the background so errors (probably?) get ignored by this recipe
    just b # runs in the foreground and treated normally regarding errors

parallel-cmd:
    start /b just a
    just b

...in most cases?

My-Machine:best-project-ever account$ just parallel-sh

-or-

C:\Users\account\Projects\Best-Ever> just parallel-cmd

Or is more consistent error handling necessary?
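On the error-handling question: in a POSIX shell, the exit status of a backgrounded `just a` can still be propagated by waiting on its PID. A sketch with `false` and `true` standing in for a failing `just a` and a succeeding `just b`:

```shell
#!/bin/sh
# `false` stands in for a failing `just a`; `true` for `just b`.
false & pid_a=$!   # background task; without `wait`, its failure is lost
true               # foreground task; errors handled normally
if wait "$pid_a"; then
  echo "background task succeeded"
else
  echo "background task failed"   # `false` exits 1, so this branch runs
fi
```

Exiting non-zero when `wait` fails would let the recipe as a whole report the error.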

wearpants commented 2 years ago

To add to the use cases, I have daydreams about using just to replace Apache Airflow (a data engineering orchestration tool)

k3d3 commented 2 years ago

Adding in my use case for something like this, I have two commands: one to run a vite (JS) dev server, and one command to run a cargo backend web server.

I'd like one command that runs both at the same time, and kills them both when I hit Ctrl+C.

whyboris commented 1 year ago

It seems like this is already possible. My justfile:

pdf:
  hugo serve -D & sleep 5 && cd pdf && npm start

Starts my Hugo server, and at the same time waits 5 seconds, changes directory, and runs npm start

The secret is &, which seems to run things in parallel 🤔

k3d3 commented 1 year ago

Good to know! I see a lot of people using & to delegate the task to the shell.

Now, when you run just pdf, does it also stop the hugo server when you Ctrl+C the command?

runeimp commented 1 year ago

Just want to point out that the & thing only works on Unix type shells (Linux, macOS, etc.). On Windows systems this does not work in PowerShell or CMD (Command Prompt).

xavierzwirtz commented 1 year ago

Something similar to the docker compose UX would be nice. Some of the projects I use rely on docker compose simply as a background task runner because the UX is good. Being able to instead just up -d and launch native processes in the background would be fantastic.

timdp commented 1 year ago

When I first adopted just, I expected dependencies to run in parallel. It bums me out that this is so difficult to achieve.

For the sequential case,

parent: child1 child2
  stuff

is merely syntactic sugar for:

parent:
  just child1
  just child2
  stuff

which really doesn't add all that much. Conversely, getting child1 and child2 to run in parallel involves introducing additional tooling and less readable configuration files. This is strange to me.

Hence, I would argue that just can make a way bigger difference by enabling parallel execution than by solving an already solved problem. There's a big incentive to add it—even for the base case of running all dependencies in parallel, because that alone would already unlock composition of more complex flows.

I also want to add that in the JS ecosystem, a long time ago, task runners like Grunt and Gulp struggled with basically the same challenge.

huyz commented 1 year ago

Has anyone taken a look at Taskfile?

It both runs dependencies in parallel and supports a command line flag to run the specified tasks in parallel: https://taskfile.dev/usage/#task-dependencies

casey commented 1 year ago

@timdp Definitely agree this is important and one of the biggest missing features! I actually took a crack at this, but ran into weird lifetime/Sync/Send issues, and the code was really ugly, so I tabled it; if someone else wants to take a shot, they definitely could. Just a heads up: I created a project that does NFTs on Bitcoin called Ordinals, and it's popping off, so my review bandwidth is extremely limited.

syphar commented 1 year ago

I created a draft PR to implement this feature, following the pattern from Taskfile (parallel execution of dependencies, and parallel task execution when given multiple tasks on the command line).

More work remains to be done, but it depends on answers from the maintainers.

ravenclaw900 commented 1 year ago

You can get it to run in parallel and stop all processes at the end fairly easily, assuming you're using bash:

dev:
  #!/bin/bash -eux
  cmd1 &
  cmd2 &
  trap 'kill $(jobs -pr)' EXIT
  wait

The wait is necessary to keep the script from ending right after starting both processes. When you Ctrl+C Just, the script exits and the trap stops both processes.

timdp commented 1 year ago

Yeah, but then you might as well create a scripts folder and do everything in pure Bash. That's what I'm trying to avoid, personally. Just has a real opportunity to improve the experience.

syphar commented 1 year ago

btw, while I don't have an answer from any maintainer yet, #1562 already works for the things I wanted to work.

Sadly, cargo install from git doesn't work from this branch, but in any case I would highly appreciate more people testing what I did and giving feedback.

These things would work with my PR:

1. Run the given recipes on the command line in parallel:

$ just --parallel recipe_1 recipe_2 recipe_3
[...]

2. Using the [parallel] attribute, task dependencies are allowed to run in parallel:

recipe_1:
  sleep 1
recipe_2:
  sleep 2
[parallel]
foo: recipe_1 recipe_2
  echo hello

Locally I'm using both ways already.

iovis commented 1 year ago

One workaround that works for me, if you use tmux, is to launch the recipes in different windows. That way you can also monitor them separately:

full:
    tmux new-window 'just server'
    tmux new-window 'just worker'

srid commented 1 year ago

I think what we want is a Procfile like support in justfile, so we don't have to use yet another tool like honcho for it. @syphar Does your PR interleave process output like these Procfile runners do? Does it work for long-running processes?

syphar commented 1 year ago

I think what we want is a Procfile like support in justfile, so we don't have to use yet another tool like honcho for it. @syphar Does your PR interleave process output like these Procfile runners do? Does it work for long-running processes?

Now that is a PR I didn't think about for a long time ;)

From what I remember, it works for long-running processes, and does interleave the output.

A major difference from heroku local -f (or probably honcho) is that the output isn't prefixed with the process / task name, which could be added at a future point in time.

srid commented 1 year ago

From what I remember, it works for long-running processes, and does interleave the output.

Nice.

the output isn't prefixed with the process / task

This would 'seal the deal' and distinguish just greatly as an alternative to all those Procfile-based runners. Looking forward to it! (I'd implement myself if only I had the time for it ...)

Ekleog commented 10 months ago

I'll add one tidbit around this: it'd be awesome if just used the jobserver crate to implement the make jobserver protocol for downstream programs. In particular, it would let just limit the parallelism of basically all invocations of cargo to exactly the number of cores of the machine, rather than exploding parallelism and spawning more rustc processes than there are cores :)

gsemet commented 8 months ago

I love the [parallel] idea to declare tasks that can be parallelized safely. For instance in CI, I usually run all checks in parallel (make checks -j4); with this special syntax for just, I would do something like just parallel-checks. This would allow me to continue with other non-parallelizable tasks, for instance.

Something like this chain in a justfile could be feasible:

stylechecks: style checks

[parallel]
checks: bandit pylint ....

A new parameter --parallel N would still be required to decide how many workers should be started.

My main problem with make -j is identifying what really failed in case of error.

hauleth commented 7 months ago

Instead, I would prefer that each task is independent by default, so I don't need to write anything extra to run tasks in parallel; there could then be an option to mark that some task conflicts with another. With a [parallel] attribute, does it mean that just this task is parallelizable, or that all dependencies of this task are parallelizable? What if it is parallelizable, but only when it doesn't run together with some other task? Does [parallelizable] mean that this task can run in parallel with others, or that this task's dependencies will be run in parallel?

chaoky commented 7 months ago

I've been using just with concurrently in the meantime; it's pretty good.

hauleth commented 7 months ago

Unfortunately, concurrently does not fully resolve the problem, as it runs only top-level tasks in parallel. That will fail if two tasks have a common dependency that cannot be serialised.

W1M0R commented 5 months ago

Another option might be to introduce additional syntax for dependencies:


# dependencies executed sequentially
tasks: task1 task2 task3

# task1 executes, then the task body, and then task2 and task3 (already implemented)
# https://just.systems/man/en/chapter_42.html?highlight=middl#running-recipes-at-the-end-of-a-recipe
tasks: task1 && task2 task3

# execute task1 and task2 in parallel, and when task2 finishes continue with task3
tasks: task1 & task2 task3

# execute task1 and task2 and task3 in parallel
tasks: task1 & task2 & task3

So occurrences of & indicate parallel tasks, similar to the syntax for background jobs in a shell, but with more power (e.g. no zombie processes, etc.).

hauleth commented 5 months ago

@W1M0R I don't understand why the dependent should define the order in which the tasks run. This also introduces a problem when, for example, task1 and task2 both depend on task0: will it be run twice or once?

W1M0R commented 5 months ago

If both depend on task0, then it should run once.

In this example, the author of the tasks recipe knows that the individual tasks can be executed in parallel without interfering with each other.

There may be other tasks that shouldn't be run in parallel, i.e. one that deletes a folder and another one that creates that folder. The recipe author should get to decide which tasks it wants to have executed in parallel.

W1M0R commented 3 months ago

@theHamsta

For long running tasks that need to run in parallel, I call into the following Taskfile.yaml:

version: '3'

interval: 2s

tasks:

  # Also see: https://taskfile.dev/usage/#watch-tasks
  dev-templ: just dev-templ
  dev-astro: just dev-astro
  dev-go: just dev-go

  dev:
    desc: Run the long-running watches in parallel (Just can't do parallel tasks yet)
    deps: [dev-templ, dev-astro, dev-go] 

The justfile:

dev-up: 
  task dev

Running just dev-up will call task dev. The Taskfile calls back into long-running just recipes, running those recipes in parallel and stopping them with Ctrl+C. It would be great if it weren't necessary to shell out to another task runner (or tool, such as GNU parallel, watchexec, etc.) to accomplish this.

yonas commented 3 months ago

@W1M0R I also like using Goreman for this. Hopefully this will be possible in just soon.

W1M0R commented 3 months ago

Thanks for the tip @yonas