go-task / task

A task runner / simpler Make alternative written in Go
https://taskfile.dev
MIT License

Weighted tasks for better concurrency control #716

Open postlund opened 2 years ago

postlund commented 2 years ago

It's possible to control how many tasks Task runs concurrently via -C. A problem, however, is that the command(s) of a task may spawn any number of threads or processes, meaning one task actually represents several "tasks". One example is calling make -j, which tells make to use one process per available CPU. Assume we have a machine with eight cores and we run eight tasks in parallel, each calling make -j: that would mean 64(!) processes running at the same time. I'm hitting a similar situation in my CI environment and it totally tanks my runner.

My suggestion is to allow a weight value (or similar) for a task, which is then used when calculating the number of running tasks. A simple example:

version: "3"

tasks:
  a:
    weight: 2
    cmds:
      - make -j

  b:
    weight: 2
    cmds:
      - make -j

  default:
    deps: [a, b]

Running task -C 2 would run either a or b but not both concurrently. I would say that it should be OK to over-schedule once, e.g. if task a above had a weight of 1, then both would run concurrently even though the total weight would be three (but no other task would be scheduled until one of them finishes). This could however be a hard vs. soft limit setting too.
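For illustration, a minimal sketch of how such a weighted budget could be enforced in Go, assuming a scheduler built on golang.org/x/sync/semaphore; the run helper and hard-coded weights are illustrative only, not Task's actual scheduler:

package main

import (
	"context"
	"fmt"
	"sync"

	"golang.org/x/sync/semaphore"
)

func main() {
	ctx := context.Background()
	// task -C 2: a total concurrency budget of 2.
	budget := semaphore.NewWeighted(2)

	// run blocks until the task's full weight fits into the budget,
	// so two tasks of weight 2 never run at the same time.
	run := func(name string, weight int64, wg *sync.WaitGroup) {
		defer wg.Done()
		if err := budget.Acquire(ctx, weight); err != nil {
			return
		}
		defer budget.Release(weight)
		fmt.Println("running", name) // the task's cmds would run here
	}

	var wg sync.WaitGroup
	wg.Add(2)
	go run("a", 2, &wg)
	go run("b", 2, &wg)
	wg.Wait()
	// Note: x/sync's semaphore never over-schedules, so the soft-limit
	// behaviour described above would need extra logic on top.
}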

AntoinePrv commented 2 years ago

I found this issue looking for a clever way to set the number of CPUs for make as well. My idea would be to have a special {{.CPU}} value dynamically set by task.

tasks:
  a:
    cmds:
      - make -j {{.CPU}}

  b:
    cmds:
      - make -j {{.CPU}}

Depending on the other commands, task would decide to allow more or fewer CPUs to each of them. For instance, with 4 CPUs task a would use them all, but task --parallel a b could split them 2 and 2.
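For illustration, a rough sketch of that split, assuming Task divided runtime.NumCPU() evenly across the tasks currently running in parallel (cpuShare is a hypothetical helper, not an existing Task function):

package main

import (
	"fmt"
	"runtime"
)

// cpuShare is a hypothetical helper: it splits the available CPUs
// evenly across the tasks running in parallel, giving each at least one.
func cpuShare(parallelTasks int) int {
	share := runtime.NumCPU() / parallelTasks
	if share < 1 {
		share = 1
	}
	return share
}

func main() {
	// On a 4-CPU machine: task a alone gets 4; a and b in
	// parallel get 2 each, matching the example above.
	fmt.Println(cpuShare(1), cpuShare(2))
}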

MarioSchwalbe commented 2 years ago

Hi, I'm also running C++ build systems (meson) on my CI with task. Since they already build in parallel, what is the actual benefit of any more parallelization within task? I simply run the tasks sequentially, doing the equivalent of this (instead of deps):

tasks:
  default:
    cmds:
      - task: a
      - task: b

Also, please note that make specifically has the --load-average switch, which may help keep CPU utilization under control (e.g. make -j -l 8 won't start new jobs while the load average is above 8):

  -l [N], --load-average[=N], --max-load[=N]
                              Don't start multiple jobs unless load is below N.

AntoinePrv commented 2 years ago

Hi @MarioSchwalbe, in my case I would have a certain number of tasks that can be parallelized, not only make tasks. My goal is to get n_cpu(make) + n_cpu(Task) == N, that is, when make is not running I still want to leverage parallelism from Task (so sequential is no good), but if Task is doing some things in parallel, I do not want make to use more than the remaining idle CPUs. Does --load-average look at all the jobs run by the OS, or does it only apply to jobs within make?

MarioSchwalbe commented 2 years ago

Hi @MarioSchwalbe, in my case I would have a certain number of tasks that can be parallelized, not only make tasks. My goal is to get n_cpu(make) + n_cpu(Task) == N, that is, when make is not running I still want to leverage parallelism from Task (so sequential is no good),

I don't think this is (directly) possible right now. However, the sequential approach I proposed above would only run one instance of parallelized make at a time. That means you can run the task that sequentially runs all the makes (called default above) concurrently with other unparallelized tasks, and end up with only 8 processes from make in one task plus 7 processes from other tasks (== 15). This should be much better than the full cross-product (64).

... but if Task is doing some things in parallel, I do not want make to use more than the remaining idle CPUs. Does --load-average look at all the jobs run by the OS, or does it only apply to jobs within make?

Although not stated explicitly in the manual (https://www.gnu.org/software/make/manual/make.html#Parallel), on UNIX-like systems the load average is a global property, taking all processes of all users into account.

eli-schwartz commented 2 years ago

There are two possible resolution concepts here:

Refusing to run make multiple times, because it might use too many CPUs, means you're dropping some parallelization potential on the floor. It's not ideal, but it's better than bringing the CPU to a grinding halt.

The "correct" solution, however, is https://www.gnu.org/software/make/manual/html_node/Job-Slots.html

This allows the top-level process to be in charge: it handles -j once and communicates to tools such as make that they are not "allowed" to do their own parallel handling, but should instead ask the jobserver for job slots and only use as many as e.g. go-task is willing to hand out.

This is typically used with recursive Make. The user runs make -j8, and the Makefile recipe internally does stuff like make -C subdir1 and make -C subdir2, and those subdirs do not have their own -j, but instead share the same 8 jobs that the top make has.

They can each run 4 jobs, but if subdir1 finishes early, then subdir2 should not be capped at 4 jobs. The jobserver means that subdir2 knows it can grow to 8 jobs (or they can split 2 & 6, or 1 & 7, or whatever).
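For reference, a minimal sketch of the client side of that protocol, assuming GNU make 4.4's fifo-style --jobserver-auth (older makes pass a pair of pipe fds instead); this is purely illustrative, not something go-task implements today:

package main

import (
	"fmt"
	"os"
	"regexp"
)

func main() {
	// The parent make advertises the jobserver in MAKEFLAGS,
	// e.g. "--jobserver-auth=fifo:/tmp/GMfifo1234" (GNU make >= 4.4).
	m := regexp.MustCompile(`--jobserver-auth=fifo:(\S+)`).
		FindStringSubmatch(os.Getenv("MAKEFLAGS"))
	if m == nil {
		fmt.Println("no jobserver; run a single job")
		return
	}
	fifo, err := os.OpenFile(m[1], os.O_RDWR, 0)
	if err != nil {
		panic(err)
	}
	defer fifo.Close()

	// Acquire one extra job slot by reading a token byte; the parent
	// preloaded N-1 tokens for "make -jN".
	token := make([]byte, 1)
	if _, err := fifo.Read(token); err != nil {
		panic(err)
	}
	fmt.Println("got a job slot; running one parallel job")
	// ... do the work ...

	// Release the slot by writing the same byte back.
	fifo.Write(token)
}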