More control around --keep_going

peakschris commented 1 month ago

Description of the bug:

We use --keep_going to collect a full set of errors that can be fixed in one go. --keep_going could be improved to make collecting that set more practical:

Allow a max errors limit to be supplied - after reaching this number of failing targets, stop.

This would allow a good set of failing targets to be collected, whilst not allowing a large monorepo build with some systemic issue to spend a long time failing everything.

Which category does this issue belong to?

Core

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

Windows

What is the output of `bazel info release`?

7.3.1

If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.

No response

What's the output of `git remote get-url origin; git rev-parse HEAD` ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

iancha1992 commented 1 month ago

@fmeum is this related to https://github.com/bazelbuild/bazel/pull/23828?

fmeum commented 1 month ago

@iancha1992 It's not. I would guess it's Team-Core.

haxorz commented 3 weeks ago

This is qualitatively a reasonable FR. But I could use some help coming up with something that'd be quantitatively useful.

> Allow a max errors limit to be supplied - after reaching this number of failing targets, stop... This would allow a good set of failing targets to be collected

In the implementation of Bazel, --keep_going cares about the presence/absence of errors in the internal Skyframe evaluation being done. So the units of consideration aren't user-facing things like "for bazel build T1 T2 did T1 fail to build?" but more like "for bazel build T1 T2 did some f(x) error out?", i.e. arbitrary functions in the guts of Bazel. Bazel is implemented on top of Skyframe such that no errors in these Skyframe function calls implies no user-facing errors for the overall bazel invocation; that's why the current all-or-nothing approach to --keep_going makes sense and is useful.
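To make the unit mismatch concrete, here is a toy Python model. It is not Bazel or Skyframe code: the graph, the node names (`compile_a`, `compile_b`, `shared_lib`), and the `failed_targets` helper are all invented for illustration. It only shows that the number of failing internal nodes and the number of failing top-level targets need not agree.

```python
from typing import Dict, List, Set

# Hypothetical dependency graph: each key depends on the listed nodes.
# T1 and T2 stand in for top-level targets; the rest stand in for
# internal function evaluations.
DEPS: Dict[str, List[str]] = {
    "T1": ["compile_a", "shared_lib"],
    "T2": ["compile_b", "shared_lib"],
    "compile_a": [],
    "compile_b": [],
    "shared_lib": [],
}


def failed_targets(
    targets: List[str], deps: Dict[str, List[str]], failing_nodes: Set[str]
) -> List[str]:
    """Return the targets that transitively depend on a failing node."""

    def reaches_failure(node: str, seen: Set[str]) -> bool:
        if node in failing_nodes:
            return True
        if node in seen:
            return False
        seen.add(node)
        return any(reaches_failure(dep, seen) for dep in deps.get(node, []))

    return [t for t in targets if reaches_failure(t, set())]


# One internal error, two failing targets.
print(failed_targets(["T1", "T2"], DEPS, {"shared_lib"}))  # ['T1', 'T2']
# One internal error, one failing target.
print(failed_targets(["T1", "T2"], DEPS, {"compile_a"}))  # ['T1']
```

In this toy graph a single error in `shared_lib` fails both targets, while a single error in `compile_a` fails only one, so a count of internal errors says little about how many targets are affected.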

It would be pretty straightforward to, e.g., have --keep_going_until=k cause Bazel to keep going until k Skyframe errors are encountered. I just don't know how Bazel users would be able to pick a value of k that is both meaningful and useful.
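As a rough sketch of what "keep going until k errors" could mean, here is a toy sequential evaluator in Python. It is an assumption-laden illustration, not Bazel's implementation: `--keep_going_until` is only the flag name floated in this comment and does not exist in Bazel, and `evaluate_with_error_budget`, `toy_evaluate`, and the node names are hypothetical.

```python
from typing import Callable, List, Tuple


def evaluate_with_error_budget(
    nodes: List[str], evaluate: Callable[[str], None], k: int
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Evaluate nodes in order, stopping once k errors have been recorded.

    Returns the nodes that were attempted and the (node, message) errors.
    """
    attempted: List[str] = []
    errors: List[Tuple[str, str]] = []
    for node in nodes:
        attempted.append(node)
        try:
            evaluate(node)
        except Exception as exc:
            errors.append((node, str(exc)))
            if len(errors) >= k:
                break  # error budget exhausted: stop instead of keeping going
    return attempted, errors


def toy_evaluate(node: str) -> None:
    """Hypothetical node function: any node whose name starts with 'bad' fails."""
    if node.startswith("bad"):
        raise ValueError(f"{node} failed")


nodes = ["a", "bad1", "b", "bad2", "c"]
attempted, errors = evaluate_with_error_budget(nodes, toy_evaluate, k=2)
print(attempted)  # ['a', 'bad1', 'b', 'bad2']
print(len(errors))  # 2
```

The sketch counts errors per evaluated node, which is the unit described above, so the value of k says nothing direct about how many user-facing targets have failed. A parallel evaluator would also make it order-dependent which k errors are hit first.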
