bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0

`Create symlink tree out-of-process` is only limited by `--jobs` #21594

Open brentleyjones opened 8 months ago

brentleyjones commented 8 months ago

Description of the bug:

With a high `--jobs` value, many concurrent `Create symlink tree out-of-process` actions can starve the system of resources, resulting in much longer builds.

Ideally, this should be limited to HOST_CPUS, `--local_cpu_resources`, or a new disk-related resource.

Which category does this issue belong to?

Local Execution, Performance

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Use a resource-constrained machine (possibly a VM), set a high --jobs value, have lots of targets with runfiles (each of which triggers a Create symlink tree out-of-process action), and then build. You will see that symlink tree creation, which usually takes a couple of seconds at most, can now take over 1000 seconds:
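Outside of Bazel, the effect of oversubscribing symlink creation can be probed with a small timing harness. The following is a minimal sketch, not Bazel's implementation: it creates the links in-process with threads rather than out-of-process, and the names `create_symlink_tree` and `time_trees`, as well as the tree and link counts, are hypothetical choices for illustration. Whether the slowdown shows up depends heavily on the machine and filesystem.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def create_symlink_tree(root, target, links):
    """Create `links` symlinks under `root`, all pointing at `target`."""
    os.makedirs(root, exist_ok=True)
    for i in range(links):
        os.symlink(target, os.path.join(root, f"link_{i}"))
    return links


def time_trees(workers, trees=32, links=200):
    """Build `trees` symlink trees on `workers` threads.

    Returns (elapsed seconds, total symlinks created). The thread count
    stands in for --jobs here; that mapping is an assumption of this sketch.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")
        with open(target, "w") as f:
            f.write("x")
        roots = [os.path.join(tmp, f"tree_{n}") for n in range(trees)]
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=workers) as pool:
            created = sum(
                pool.map(lambda r: create_symlink_tree(r, target, links), roots)
            )
        return time.perf_counter() - start, created


if __name__ == "__main__":
    # Compare a CPU-sized pool against deliberately oversized ones.
    for workers in (os.cpu_count() or 1, 64, 256):
        seconds, created = time_trees(workers)
        print(f"workers={workers}: {created} symlinks in {seconds:.3f}s")
```

Comparing the timings across worker counts on the constrained machine gives a rough idea of where extra concurrency stops helping.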

[Screenshot: CleanShot 2024-03-06 at 09 09 59@2x]

Which operating system are you running Bazel on?

macOS

What is the output of bazel info release?

release 7.0.2

Any other information, logs, or outputs that you want to share?

I know that in the future Loom will decouple --jobs from the maximum number of downloads or remote actions performed, which could sidestep this issue for us. But I'm not sure we need to wait for that to fix this issue.

tjgq commented 8 months ago

FWIW, I think a "maximum number of concurrent I/O-bound tasks" parameter (regardless of whether it's a flag or an automatically determined value) could be useful in other contexts. For example, when reading/writing a bunch of files in parallel from/to a disk cache, we currently schedule the work onto a --jobs-sized pool, which will likely suffer a similar fate if --jobs is too high, and be unreasonably slow if --jobs is too low.
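To make the idea concrete, here is a minimal sketch of an I/O concurrency cap that is independent of the worker pool size. It is an illustration only, not Bazel code: `IoLimiter`, `fake_io`, `run_jobs`, and the `max_io_tasks` parameter are hypothetical names, and the sleep stands in for real disk work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class IoLimiter:
    """Caps concurrent I/O tasks independently of the worker pool size."""

    def __init__(self, max_io_tasks):
        self._slots = threading.BoundedSemaphore(max_io_tasks)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of tasks observed running at once

    def run(self, fn, *args):
        # Block until an I/O slot is free, then run the task.
        with self._slots:
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return fn(*args)
            finally:
                with self._lock:
                    self._active -= 1


def fake_io(n):
    """Stand-in for a disk-bound task such as creating a symlink tree."""
    time.sleep(0.01)
    return n * 2


def run_jobs(jobs, max_io_tasks, task_count):
    """Schedule tasks on a `jobs`-sized pool while capping I/O concurrency."""
    limiter = IoLimiter(max_io_tasks)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(
            pool.map(lambda n: limiter.run(fake_io, n), range(task_count))
        )
    return results, limiter.peak


if __name__ == "__main__":
    results, peak = run_jobs(jobs=32, max_io_tasks=4, task_count=40)
    print(f"completed {len(results)} tasks, peak I/O concurrency {peak}")
```

In this sketch, threads waiting on the semaphore still occupy slots in the `jobs`-sized pool. A dedicated, separately sized I/O pool would avoid that, at the cost of handing work between two pools.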

meisterT commented 8 months ago

cc @coeuvre