Open postlund opened 2 years ago
I found this issue looking for a clever way to set the number of CPUs for `make` as well. My idea was to have a special `{{.CPU}}` value dynamically set by `task`:
```yaml
tasks:
  a:
    cmds:
      - make -j {{.CPU}}
  b:
    cmds:
      - make -j {{.CPU}}
```
Depending on the other commands, `task` would decide to allow more or fewer CPUs to each of them. For instance, with 4 CPUs `task a` would use them all, but `task --parallel a b` could split them 2 and 2.
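As a sketch of the proposed semantics (this is the commenter's proposal, not an existing `task` feature, and the helper below is hypothetical):

```python
import os

# Hypothetical {{.CPU}} behavior: divide the available CPUs evenly
# among the tasks that task runs in parallel, never going below 1.
def cpus_per_task(n_parallel, total=None):
    total = total or os.cpu_count()
    return max(1, total // n_parallel)

assert cpus_per_task(1, total=4) == 4   # `task a` gets all 4 CPUs
assert cpus_per_task(2, total=4) == 2   # `task --parallel a b` -> 2 and 2
```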
Hi, I'm also running C++ build systems (meson) on my CI with task. Since they already build in parallel, what is the actual benefit of any more parallelization within task? I simply run tasks sequentially, doing the equivalent of the following (instead of `deps`):
```yaml
tasks:
  default:
    cmds:
      - task: a
      - task: b
```
Also, please note that `make` specifically has the `--load-average` switch, which may help keep the CPU utilization feasible:

```
-l [N], --load-average[=N], --max-load[=N]
    Don't start multiple jobs unless load is below N.
```
Hi @MarioSchwalbe, in my case I would have a certain number of tasks that can be parallelized, not only `make` tasks. My goal is to get `n_cpu(make) + n_cpu(Task) == N`, that is, when `make` is not running I still want to leverage parallelism from Task (so sequential is no good), but if Task is doing some things in parallel, I do not want `make` to use more than the remaining idle CPUs. Does `--load-average` look at all the jobs run by the OS, or does it only apply to jobs within `make`?
> Hi @MarioSchwalbe, in my case I would have a certain number of tasks that can be parallelized, not only `make` tasks. My goal is to get `n_cpu(make) + n_cpu(Task) == N`, that is, when `make` is not running I still want to leverage parallelism from Task (so sequential is no good),
I don't think this is (directly) possible right now. However, the sequential approach I proposed above would only run one instance of parallelized `make` at a time. That means you can run the task that sequentially runs all `make`s (called `default` above) concurrently with other unparallelized tasks, and end up with only 8 processes from `make` in 1 task plus 7 processes from other tasks (== 15). This should be much better than the full cross-product (64).
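Concretely, the combination described could look like the following Taskfile fragment (the `lint` task and `./lint.sh` script are made-up placeholders for the "other unparallelized tasks"):

```yaml
tasks:
  default:          # all parallelized makes, strictly one at a time
    cmds:
      - task: a     # make -j: up to 8 processes, but never alongside b
      - task: b
  lint:             # an unparallelized, single-process task
    cmds:
      - ./lint.sh
```

Running `task --parallel default lint` then keeps at most one `make -j` (8 processes) plus one `lint` process in flight.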
> ... but if Task is doing some things in parallel, I do not want `make` to use more than the remaining idle CPUs. Does `--load-average` look at all the jobs run by the OS, or does it only apply to jobs within `make`?
Although not stated explicitly in the manual (https://www.gnu.org/software/make/manual/make.html#Parallel), on UNIX-like systems the load average is a global property, taking all processes of all users into account.
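For reference, that same global number is what Python's `os.getloadavg()` exposes on UNIX-like systems, so it is a quick way to see the value `--load-average` compares against:

```python
import os

# os.getloadavg() returns the system-wide 1-, 5- and 15-minute load
# averages -- the same global numbers `uptime` or /proc/loadavg show,
# covering all processes of all users.
one, five, fifteen = os.getloadavg()
print(f"1-min load average (all processes, all users): {one:.2f}")
```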
There are two possible resolution concepts here.

Refusing to run `make` multiple times because it might use too much means you're dropping some parallelization potential on the floor. It's not ideal, but it's better than bringing the CPU to a grinding halt.
The "correct" solution, however, is the jobserver: https://www.gnu.org/software/make/manual/html_node/Job-Slots.html

This allows the top running process to be in charge: it handles `-j` once and communicates to tools such as `make` that they are not "allowed" to do their own parallel handling, but should instead ask the jobserver for job slots and use only as many as e.g. go-task is willing to hand out.
This is typically used with recursive make. The user runs `make -j8`, and the Makefile recipe internally does things like `make -C subdir1` and `make -C subdir2`; those subdirs do not have their own `-j`, but instead share the same 8 jobs that the top make has. They can each run 4 jobs, but if subdir1 finishes early, subdir2 should not stay capped at 4 jobs. The jobserver means that subdir2 knows it can grow to 8 jobs (or 2 & 6, or 1 & 7, or whatever).
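The token-passing idea behind the jobserver can be sketched with a plain pipe (a simplification: real make advertises the jobserver file descriptors via `MAKEFLAGS`, e.g. `--jobserver-auth`, and the token bytes are implementation-defined):

```python
import os

# A pipe pre-loaded with N-1 job-slot tokens, shared by all sub-makes.
JOBS = 8
r, w = os.pipe()
os.write(w, b"+" * (JOBS - 1))  # N-1 tokens; each process's first job is free

def acquire_slot():
    # A sub-make blocks here until a token is available.
    return os.read(r, 1)

def release_slot(token):
    # Handing the token back lets any sibling grow its parallelism --
    # this is how subdir2 can ramp up to 8 jobs once subdir1 is done.
    os.write(w, token)

token = acquire_slot()   # take a slot before starting a compile job
release_slot(token)      # return it when the job finishes
```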
It's possible to control how many tasks `task` itself runs concurrently via `-C`. A problem, however, is that the command(s) of a task may spawn any number of threads or processes, meaning one task actually represents several "tasks". One example is calling `make -j`, which tells make to use one process per available CPU. Assume we have a machine with eight cores and we run eight tasks in parallel, each calling `make -j`: that would mean 64(!) processes running at the same time. I'm hitting a similar situation in my CI environment and it totally tanks my runner.

My suggestion is to allow a `weight` value (or similar) for a task, which is used when calculating the number of running tasks. A simple example:

Running `task -C 2` would run either `a` or `b` but not both concurrently. I would say that it should be OK to over-schedule once, e.g. if task `a` above had a weight value of 1, then both would run concurrently even if the total weight would be three (but no other task would be scheduled until one of them finishes). This could, however, be a hard vs. soft limit setting too.
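A sketch of what such a weighted Taskfile could look like (the `weight` field is the proposal itself, and the values of 2 are assumptions chosen to match the description above):

```yaml
tasks:
  a:
    weight: 2      # counts as two slots against -C
    cmds:
      - make -j
  b:
    weight: 2
    cmds:
      - make -j
```

With both weights at 2, `task -C 2` runs only one of them at a time; lowering `a` to weight 1 would let both start (total weight 3, over-scheduled once under a soft limit).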