Open brentleyjones opened 8 months ago
FWIW, I think a "maximum number of concurrent I/O bound tasks" parameter (regardless of whether it's a flag or an automatically determined value) could be useful in other contexts. For example, when reading/writing a bunch of files in parallel from/to a disk cache, we currently schedule the work onto a --jobs-sized pool, which will likely suffer a similar fate if --jobs is too high, and be unreasonably slow if --jobs is too low.
cc @coeuvre
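To make that suggestion concrete, here is a minimal sketch (not Bazel's actual code) of what an I/O concurrency limit decoupled from --jobs could look like, using a plain java.util.concurrent.Semaphore. The MAX_CONCURRENT_IO value and the runIoTask helper are invented for illustration only.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

/**
 * Illustrative sketch only: bound the number of in-flight I/O-bound tasks
 * independently of the size of the --jobs-sized worker pool.
 */
final class BoundedIoExample {
  // Hypothetical limit; in practice this could come from a flag or be derived
  // from the host's disk characteristics rather than from --jobs.
  private static final int MAX_CONCURRENT_IO = 16;
  private static final Semaphore IO_PERMITS = new Semaphore(MAX_CONCURRENT_IO);

  static void runIoTask(Runnable ioWork) throws InterruptedException {
    IO_PERMITS.acquire(); // blocks once MAX_CONCURRENT_IO tasks are in flight
    try {
      ioWork.run();
    } finally {
      IO_PERMITS.release();
    }
  }

  public static void main(String[] args) {
    // A large pool (standing in for --jobs) still performs at most
    // MAX_CONCURRENT_IO file operations at a time.
    ExecutorService jobs = Executors.newFixedThreadPool(500);
    for (int i = 0; i < 5000; i++) {
      jobs.submit(() -> {
        try {
          runIoTask(() -> { /* read or write a disk cache entry here */ });
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    jobs.shutdown();
  }
}
```

A fixed-size pool dedicated to I/O would work just as well; the point is only that the I/O limit is chosen independently of --jobs.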
Description of the bug:
With a high --jobs value, multiple "Create symlink tree out-of-process" invocations can starve the system of resources, resulting in much longer builds. Ideally this should be limited to HOST_CPUS, --local_cpu_resources, or a new disk-related resource.
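As a rough illustration of that suggestion (again, not Bazel's implementation), the sketch below caps concurrent out-of-process symlink tree creation at the host CPU count; the class name, the createSymlinkTree helper, and the helper command passed in are all placeholders.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * Illustrative sketch only: spawn at most HOST_CPUS symlink-tree helper
 * processes at a time, no matter how high --jobs is set.
 */
final class SymlinkTreeThrottleExample {
  // Stand-in for HOST_CPUS; a real implementation might instead consult
  // --local_cpu_resources or a dedicated disk-I/O resource.
  private static final int HOST_CPUS = Runtime.getRuntime().availableProcessors();
  private static final ExecutorService SYMLINK_POOL =
      Executors.newFixedThreadPool(HOST_CPUS);

  static void createSymlinkTree(List<String> helperCommand) {
    SYMLINK_POOL.submit(() -> {
      try {
        // Placeholder command; the actual helper and its arguments differ.
        Process p = new ProcessBuilder(helperCommand).inheritIO().start();
        p.waitFor();
      } catch (IOException e) {
        throw new RuntimeException(e);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
  }
}
```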
Which category does this issue belong to?
Local Execution, Performance
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Have a resource-constrained machine (possibly a VM), set a high --jobs value, have lots of targets that have runfiles (which will then use "Create symlink tree out-of-process"), and then build. You will see that symlink tree creation that usually takes a couple of seconds at most can now take over 1,000 seconds.
Which operating system are you running Bazel on?
macOS
What is the output of bazel info release?
release 7.0.2
Any other information, logs, or outputs that you want to share?
I know that in the future Loom will decouple --jobs from the maximum number of downloads or remote actions performed, which could sidestep this issue for us. But I'm not sure we need to wait for that to fix this issue.