ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

RFC: -j/--jobs for zig subcommands #12101

Open motiejus opened 2 years ago

motiejus commented 2 years ago

Reason: bazel-zig-cc invokes zig cc. Bazel takes care of parallelism itself, so it will assume zig is a "simple" command that uses a single core. It is not, however.

On many-core machines this leads to many parallel processes, each using many cores. Thus we should use zig cc -j1 (or equivalent) in bazel-zig-cc. Otherwise the number of jobs multiplies quickly. (Unfortunately, Bazel neither supports a jobserver nor can limit the number of downstream cores via cgroups or similar.)

I have almost implemented -jN, --jobs=N like this:

  1. ThreadPool.init() accepts a new argument jobs: ?usize
  2. If jobs is set, uses that variable for thread_count, otherwise the behavior is as before.
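The rule in the two steps above is small enough to sketch outside of Zig. This is a minimal Python illustration, not the actual ThreadPool.init() code: resolve_thread_count is a hypothetical name, and os.cpu_count() is only a stand-in for whatever detection Zig performs when no job count is given.

```python
import os
from typing import Optional


def resolve_thread_count(jobs: Optional[int] = None) -> int:
    """Hypothetical helper: choose the worker count for a thread pool."""
    if jobs is not None:
        # An explicit -jN / --jobs=N takes precedence over auto-detection.
        return jobs
    # No flag given: size the pool from the machine, as before.
    # os.cpu_count() is an assumed stand-in for Zig's own detection.
    return os.cpu_count() or 1


print(resolve_thread_count(1))  # explicit job count
print(resolve_thread_count())   # auto-detected count
```

With this shape, bazel-zig-cc would pass a job count of 1, and every other caller would keep today's behavior.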

And interesting questions ensued:

  1. Is -jN, --jobs=N a reasonable argument to add to zig cc, zig build-* et al?
  2. This would be the first option that clang does not accept (technically, not in clang/include/clang/Driver/Options.td). Is this OK as a precedent to have zig cc options that are not valid clang options? I can also do this go-style via ZIGMAXPROCS. :)

EDIT: removed a long question about two thread pools in src/test.zig, that was somewhat answered in a commit message where they were introduced.

andrewrk commented 2 years ago

Note that on Linux, ulimit -u 1 can be used to limit the number of processors visible to a child process. I'm not sure if there is a POSIX equivalent, and I understand it may not be possible to use this, or similar, in the Bazel context.

As far as adding a -j command line argument to Zig, I would like to explore other possibilities first. If it's possible to do the right thing in all cases without user participation, then we should aim for that. For example, one idea that I would like to try would be setting the thread priority value on every worker thread to slightly less than normal priority. The main thread (or if all threads are peers, then exactly one worker thread) would be set to normal priority. In theory this would cooperate perfectly with the rest of the system, using available resources when there is nothing else to do, but in the case of a busy system, putting pressure on only one CPU core.
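To make the priority idea concrete, here is a rough Python sketch, assuming Linux, where the nice value is a per-thread attribute and setpriority with a thread ID affects only that thread. The function name and the nice delta of 1 are illustrative assumptions; nothing here is Zig code, and whether this produces the scheduling behavior described above is untested.

```python
import os
import threading


def run_with_deprioritized_workers(n_workers: int, nice_delta: int = 1) -> dict:
    """Start workers that each lower their own priority; the main thread is left alone."""
    base_nice = os.getpriority(os.PRIO_PROCESS, 0)
    # 19 is the highest nice value (lowest priority) on Linux.
    worker_nice = min(base_nice + nice_delta, 19)
    observed = {}

    def worker(index: int) -> None:
        tid = threading.get_native_id()
        # Linux-specific: passing a thread ID changes only this thread.
        os.setpriority(os.PRIO_PROCESS, tid, worker_nice)
        observed[index] = os.getpriority(os.PRIO_PROCESS, tid)
        # Real work (for example, compilation jobs) would run here.

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The main thread keeps its original priority.
    observed["main"] = os.getpriority(os.PRIO_PROCESS, 0)
    return observed


print(run_with_deprioritized_workers(3))
```

Raising a thread's nice value needs no special privileges, so a sketch like this can run as an ordinary user.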

This would be a more efficient solution than a -j style flag because there are many scenarios where the system would not be fully utilized, and running a zig command single-threaded would unnecessarily leave idle processors wasted. For example, if there are 8 cores, and 4 zig cc commands. Or more trivially, the final zig cc command would always exhibit this problem since it would be running alone.
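The arithmetic behind that example can be written out. This is a back-of-the-envelope Python sketch; idle_cores is a hypothetical helper, and it assumes each job keeps exactly one core busy.

```python
def idle_cores(total_cores: int, commands: int, jobs_per_command: int) -> int:
    """Cores left unused when each command is capped at jobs_per_command threads."""
    busy = min(total_cores, commands * jobs_per_command)
    return total_cores - busy


# 8 cores, 4 concurrent zig cc commands, each limited to one job.
print(idle_cores(8, 4, 1))  # 4
# The final command runs alone.
print(idle_cores(8, 1, 1))  # 7
```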

motiejus commented 2 years ago

> Note that on Linux, ulimit -u 1 can be used to limit the number of processors visible to a child process. I'm not sure if there is a POSIX equivalent, and I understand it may not be possible to use this, or similar, in the Bazel context.

ulimit -u 1 does not really do what's intended here: it makes fork() fail. I had half expected it to delay the actual fork until a "slot" became available.

> As far as adding a -j command line argument to Zig, I would like to explore other possibilities first. If it's possible to do the right thing in all cases without user participation, then we should aim for that. For example, one idea that I would like to try would be setting the thread priority value on every worker thread to slightly less than normal priority. The main thread (or if all threads are peers, then exactly one worker thread) would be set to normal priority. In theory this would cooperate perfectly with the rest of the system, using available resources when there is nothing else to do, but in the case of a busy system, putting pressure on only one CPU core.

I spent some time reading about this, at least on Linux. Linux is supposed to do the "right thing", at least on my desktop, with autogroups. It does not, however: when I run tests of bazel-zig-cc, the system becomes unresponsive, instead of allocating as much CPU as required for other autogroup tasks. Well, this requires a more serious investigation.

> This would be a more efficient solution than a -j style flag because there are many scenarios where the system would not be fully utilized, and running a zig command single-threaded would unnecessarily leave idle processors wasted. For example, if there are 8 cores, and 4 zig cc commands. Or more trivially, the final zig cc command would always exhibit this problem since it would be running alone.

Understood.

nektro commented 1 year ago

Would love to see this reopened as a tracker. Zig should try to maximize usage except when explicitly told not to, and that intervention is not currently possible until this is resolved.

M-Evans commented 1 year ago

I would appreciate a -j flag because I'm performing builds on a Raspberry Pi and want to avoid running too hot. I'm willing to wait a bit longer in order to avoid getting uncomfortably close to the max operating temp for too long.


Every 0.5s: cat /sys/class/thermal/thermal_zone0/temp                                                                    raspberrypi: Sat Jul  8 21:10:05 2023

75212

Note: the value is in millidegrees Celsius, so 75212 is about 75 °C. Ref: https://community.element14.com/products/raspberry-pi/b/blog/posts/how-hot-is-too-hot-for-raspberry-pi

The SoC (System on Chip – the integrated circuit that does the Pi’s processing, a Broadcom BCM2837B0) is qualified from -40°C to 85°C.

nektro commented 1 year ago

note that this has been implemented for zig build

edit: here https://github.com/ziglang/zig/commit/cb094700631ea1ae238ea678c192ce4f85fbecc0

whitequark commented 1 day ago

Right now you can't really debug zig cc issues because if you run zig cc -### the output looks like this:

[image: screenshot of the zig cc -### output]

whitequark commented 1 day ago

> Note that on Linux, ulimit -u 1 can be used to limit the number of processors visible to a child process.

I'm not sure what that's supposed to do, but running it in bash produces something like this:

$ ulimit -u 1
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable

^Cbash: fork: Interrupted system call

bash: wait_for: No record of process 513758
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
^[bash: fork: retry: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable
bash: wait_for: No record of process 513758
bash: fork: retry: Resource temporarily unavailable

> Note that on Linux, ulimit -u 1 can be used to limit the number of processors visible to a child process.

There is a misunderstanding over what the command does. It limits the number of processes, not processors:

              -u     The maximum number of processes available to a single user

So it's not suitable for the purpose of limiting the number of jobs spawned by Zig.
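The distinction can be checked directly. The following Python sketch, which assumes Linux, reads the limit that ulimit -u controls (RLIMIT_NPROC) next to the number of CPUs the current process may run on; process_limit_vs_cpus is a hypothetical helper name.

```python
import os
import resource


def process_limit_vs_cpus() -> dict:
    """Report the per-user process limit alongside the usable CPU count."""
    # RLIMIT_NPROC is what `ulimit -u` sets: a cap on processes, not processors.
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    # The CPU affinity mask is a separate setting that ulimit does not touch.
    usable_cpus = len(os.sched_getaffinity(0))
    return {"max_user_processes": soft, "usable_cpus": usable_cpus}


print(process_limit_vs_cpus())
```

The two numbers are independent: lowering the first makes fork() fail, while only the second reflects how many processors are available.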

whitequark commented 1 day ago

The workaround is to use something like:

taskset -c 0-$((jobs - 1)) zig cc ...

This affects the result of sched_getaffinity, which is what Zig (and also tools like nproc) uses to determine the number of processors to use. (This makes sense because what matters isn't how many CPUs there are in the system, but how many of them you can run on.)
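The effect of an affinity mask on the detected CPU count can be reproduced in a few lines. This Python sketch assumes Linux and uses hypothetical helper names. Unlike taskset -c 0-N, it picks the lowest-numbered CPUs from the currently allowed set rather than fixed CPU numbers.

```python
import os


def usable_cpu_count() -> int:
    """Count the CPUs this process is allowed to run on (what nproc reports)."""
    return len(os.sched_getaffinity(0))


def count_with_restricted_affinity(jobs: int) -> int:
    """Temporarily pin to at most `jobs` CPUs and report the visible count."""
    original = os.sched_getaffinity(0)
    restricted = set(sorted(original)[:jobs])
    os.sched_setaffinity(0, restricted)
    try:
        return usable_cpu_count()
    finally:
        # Restore the original mask so the caller is unaffected.
        os.sched_setaffinity(0, original)


print(usable_cpu_count())
print(count_with_restricted_affinity(1))
```

A child process started while the mask is restricted inherits it, which is why wrapping zig cc in taskset changes the thread count it picks.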