Open criemen opened 1 week ago
To me, the stack traces and the profile look indistinguishable from those of a build that is just waiting for long-running actions to finish (though of course those actions are taking very long).
If possible, could you try to bisect this down to a particular rolling release or even commit? Bazelisk accepts individual Bazel commits.
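A rough sketch of what such a bisect could look like with Bazelisk's `--bisect` option. The values of `GOOD`, `BAD`, and `TARGET` are placeholders, not taken from this issue, and the script only prints the command unless `RUN=1` is set. Since the symptom here is a hang rather than a failing exit code, each step would probably need to be watched or interrupted manually; that part is not shown.

```shell
#!/usr/bin/env bash
# Placeholders (assumptions): substitute a known-good version, a known-bad
# version or commit, and a target that reproduces the hang.
GOOD="${GOOD:-7.4.0}"
BAD="${BAD:-HEAD}"
TARGET="${TARGET:-//some:target}"

# Format the Bazelisk bisect invocation for a good..bad range and a target.
bisect_cmd() {
  echo "bazelisk --bisect=$1..$2 build $3"
}

if [ "${RUN:-0}" = "1" ]; then
  # Actually run the bisect (requires bazelisk on PATH).
  bazelisk "--bisect=${GOOD}..${BAD}" build "$TARGET"
else
  # Dry run: only show the command that would be executed.
  bisect_cmd "$GOOD" "$BAD" "$TARGET"
fi
```

To test a single commit instead of a range, Bazelisk also reads the version from the `USE_BAZEL_VERSION` environment variable, which can be set to a Bazel commit hash.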
@bazel-io flag
Okay, we're getting somewhere: disabling `--experimental_collect_worker_data_in_profiler` stops the hangs from occurring (so this might not be a release blocker after all). We also had this enabled on 7.3/7.4, but it might be that the option is just silently ignored on those branches?
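As a workaround sketch, the flag can be switched off explicitly using the `--no` prefix that Bazel accepts for boolean options. The snippet below appends that setting to a `.bazelrc`; the file location and the choice of the `build` command prefix are assumptions, so adjust them to the workspace in question.

```shell
#!/usr/bin/env bash
# Assumption: the workspace's .bazelrc lives in the current directory.
BAZELRC="${BAZELRC:-.bazelrc}"

# Negated form of the boolean flag discussed above.
LINE='build --noexperimental_collect_worker_data_in_profiler'

# Append the line only if an identical line is not already present.
grep -qxF -- "$LINE" "$BAZELRC" 2>/dev/null || echo "$LINE" >> "$BAZELRC"
```

The same setting can be passed on the command line for a one-off build instead of being persisted in the rc file.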
I got hangs back to (at least) 8.0.0-pre20240516.1, then in my manual bisecting I switched to an older version that didn't have the flag,
Enabling that flag by default was reverted, due to flakiness in the multiplex_worker tests in https://github.com/bazelbuild/bazel/commit/a9525c701125664bb9daf5637084e85dff186d31
Unfortunately, there are no PRs or external history associated with this flag.
It didn't do anything on Windows before the revert: https://github.com/bazelbuild/bazel/commit/a9525c701125664bb9daf5637084e85dff186d31#diff-b572d41bff84fa61b397e97467a898b32baf118421a0b06859e3fa04c556a7ebL219
I don't know how it works, but maybe this `if` should be brought back?
@bazel-io fork 8.0.0
Description of the bug:
After upgrading to Bazel 8 (from a pre-release of Bazel 7.4.0), we're observing Bazel hangs when building our codebase on Windows. The hangs happen both on CI and locally, but don't seem to be 100% reproducible.
I've attached a Bazel profile, a compact execution log, and jstack traces of what I believe are the two relevant Java processes for the build. Let me know if I can support you with more debug information.
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
I've not been able to reproduce this on our public codebase, and will investigate further reductions only if the current debug information isn't sufficient.
Which operating system are you running Bazel on?
Windows 11
What is the output of `bazel info release`?
No response
If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.
No response
What's the output of `git remote get-url origin; git rev-parse HEAD`?
No response
If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
hang-debugging.zip