diegohavenstein opened 2 years ago
cc @oquenchil @joeleba
Note: 5.0.0rc2 is failing as well. With 4.2.1 it works fine
@Wyverald
This should be a release blocker for 5.0, right? Any fix available?
Ping @sdtwigg
This class of failure is known, and the current plan is to roll forward with it (since the underlying cause, a cc_test being a dependency of a non-test rule, is both rare and odd).
There are a handful of mitigations (in the order that I suggest you try them):

1. Restructure the build so that no non-test rule depends on a test rule (for example, by wrapping the test logic in an artificial sh_test).
2. --experimental_retain_test_configuration_across_testonly. Theoretically, this should always work and make it impossible for --trim_test_configuration to produce action conflicts. In practice, an internal deployment attempt exposed significant Bazel bugs (related to the construction of improper ConfiguredTargetKey) that cause this to fail with action conflicts as well. There is some effort in cleaning these up. (This commit incidentally should have removed some of the surface area for potential failure.)
3. --notrim_test_configuration for that build.

@sdtwigg Thanks! Then this is a breaking change instead of a regression, I'll remove it from release blockers.
Feel free to close this issue if all agree this is not a bug to fix.
Please consider not closing this issue, but fixing this regression.
For us this is neither rare nor odd; it affects all of our cc_test targets. Hence the suggested workaround of creating an artificial sh_test is not really an option for us. All the more so, as we also have to support Windows and want to limit bash dependencies as much as we can.
Let me give you some background on why a dependency of a non-test rule is really important for us, and why we assume that other teams are affected by this as well:
Bazel currently does not support proper .dSYM file creation and has an open issue in this regard: https://github.com/bazelbuild/bazel/issues/6327
Our current workaround (which is completely hermetic) to properly debug cc_binary and cc_test targets in Bazel on macOS is to have an additional .dSYM target for each of our cc_* executable targets. These targets are only compatible with macOS, depend on the actual test or binary, and use an aspect to determine the static libraries and object files that were originally used to create the corresponding executable.
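As an illustration, the collection half of such a workaround might be sketched roughly like this; the names (`dsym.bzl`, `DsymInfo`, `dsym_aspect`) are hypothetical and this is not the poster's actual implementation:

```starlark
# dsym.bzl — a sketch (assumed names) of an aspect that walks the deps of a
# cc_binary/cc_test and collects the object files used to link it, so a
# separate macOS-only rule can feed them to dsymutil to build a .dSYM bundle.

DsymInfo = provider(fields = ["objects"])

def _dsym_aspect_impl(target, ctx):
    objects = []
    if CcInfo in target:
        # LinkerInput.libraries holds LibraryToLink values; their `objects`
        # field (possibly None) lists the object files behind each library.
        for linker_input in target[CcInfo].linking_context.linker_inputs.to_list():
            for lib in linker_input.libraries:
                objects.extend(lib.objects or [])
    return [DsymInfo(objects = depset(objects))]

dsym_aspect = aspect(
    implementation = _dsym_aspect_impl,
    attr_aspects = ["deps"],
)
```

A rule applying this aspect to its `deps` would then run an action over the collected objects to assemble the .dSYM output.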
Therefore, we believe that this is an actual bug that should be fixed and also should not be dismissed as an "odd" use case.
@rbinnatableau How about using 2. and 3. as temporary workarounds until #6327 is properly fixed?
Thanks for pointing this out. We tested them, and 2 and 3 seem to be useful workarounds (however, having to use an experimental flag feels somewhat unsatisfying).
Just out of curiosity, could it be a fix to the problem that the test-related options are not taken into considerations for the non test targets and hence such dependencies be allowed again?
> Just out of curiosity, could it be a fix to the problem that the test-related options are not taken into considerations for the non test targets and hence such dependencies be allowed again?
@sdtwigg
Note that --notrim_test_configuration
isn't experimental. I'd expect that to work even without 2., although it's nice if 2 is sufficient.
> Just out of curiosity, could it be a fix to the problem that the test-related options are not taken into considerations for the non test targets and hence such dependencies be allowed again?
I don't quite follow this question. Can you clarify?
More broadly, I appreciate the concerns you're expressing. But this isn't a case of a bug. It's a more challenging case of contradictory user needs.
--trim_test_configuration breaks your build structure by definition, not by accident, because it's behavior most users want: most tests don't have non-tests depending on them. If we propagate test flags all the way down the build, then every time someone repeats a test with different test parameters the entire build graph is reconstructed, even though 99% of it is exactly the same as before. The net result is painfully slow interactive testing.
Your use case is one of the rare ones where this optimization isn't safe. If we roll back the change, then we'll improve your use case but at the expense of the above. We ultimately have to balance these conflicting needs to find the right, most broadly helpful default.
Sometimes people have non-tests depending on tests for no important reason. In those cases they should simply refactor their builds. In yours, where you state a conceptual foundation for this structure, it makes sense to discuss your options and how the workaround can help. But we want to try as hard as we can not to hold the --trim_test_configuration
benefit back from common use cases.
I don't believe we have any intention of outright removing --trim_test_configuration
until we know these complications no longer exist. So I'd hope, in a worst case where there's really no alternative structure for you, it's reasonable to simply set --notrim_test_configuration in your bazelrcs.
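For example, opting a whole workspace out of trimming is a one-line change (assuming a standard .bazelrc at the workspace root):

```
# .bazelrc — keep the test configuration attached to all targets,
# trading incremental-build performance for correctness in this build
build --notrim_test_configuration
```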
Just my comments: @sdtwigg remains the source for definitive guidance.
@gregestren thank you for explaining the background. This is really helpful. I can totally understand that you don't want to sacrifice the improvements that impact a large number of users. My concern here is that it is not a rare case.
In the meantime we have identified two more use cases in our code base that are affected by --trim_test_configuration (so I expect there will be many more use cases out there):
In addition, we assume that all other kinds of static analysis, linting, and packaging targets that depend on test targets are affected by this as well.
If I understand you correctly, the proposed solution on your side is to do the following for every non-test dependency:
Imho this leads to a scenario where code bases that implemented one of the following use cases are affected:
In such code bases, in the worst case, all _test targets will be replaced by _binary targets plus additional sh_test-like targets.
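Concretely, the wrapper pattern being discussed looks roughly like this (target and file names are illustrative):

```starlark
# BUILD — the real logic lives in a cc_binary, and a thin sh_test wrapper
# makes it a test, so non-test rules can depend on the binary without
# crossing the test/non-test boundary that --trim_test_configuration trims.

cc_binary(
    name = "hello_test_bin",
    srcs = ["hello_test.cc"],
    testonly = True,
)

sh_test(
    name = "hello_test",
    srcs = ["run_test.sh"],  # a trivial script that execs hello_test_bin
    data = [":hello_test_bin"],
)
```

Every test then needs two targets plus a shell script, which is exactly the boilerplate and indirection objected to here.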
As this workaround results in a lot of boilerplate code and indirection, I am wondering whether there can be another solution that achieves both:
I gather from your previous comment that the way the test flags relate to caching is the reason for the performance improvements achieved by --trim_test_configuration.
As far as I understand the test flags consist of two parts:
My intuition is that the test-specific flags only impact the actual execution of the tests by the test harness, and not when they are executed via a run command or by another action as part of the build.
Therefore, I don't understand why test flags invalidate dependent targets if those do not depend on the test execution by the test harness (I am quite sure I must be missing something obvious here). Could it therefore be a solution to not tie the "test execution specific flags" to the build actions, but instead to the underlying action that executes the test with the test harness?
Seems related to #14236
@diegohavenstein @rbinnatableau
Does cherry-picking 3a1795ea303ebcc5eaec2ada5ab04a160b57de1d and adding the --experimental_shareable_cpp_compile_actions
flag work for your case?
It seems to fix the class of compile action conflicts where the conflicting actions have the same action key.
> Therefore, I don't understand why test flags invalidate dependent targets if those do not depend on the test execution by the test harness (I am quite sure I must be missing something obvious here). Could it therefore be a solution to not tie the "test execution specific flags" to the build actions, but instead to the underlying action that executes the test with the test harness?
I have to go a little into Bazel internals to explain why this doesn't quite work. The test flags are currently under the subset of Bazel build options that are part of the build configuration that is passed along to all rules in the build graph (and potentially transitioned e.g. when cfg=exec is used to get targets built for a specific execution platform). The reason why test flags are there is that the test actions are actually being injected as part of that specific rule's implementation during analysis (essentially, right after the main implementation is run, Starlark-defined or otherwise). There isn't really a way to only partially re-run a specific rule's implementation and instead the rule's entire implementation has to be re-run when the build configuration attached to it changes. --trim_test_configuration
tries to avoid the issue by not attaching the test options to non-test rules under the assumption they will not need it.
As a first aside, it was at one point explored to completely remove the test flags from the build configuration and have the top level re-build test actions as needed. There were no known hard technical blockers that would have made this impossible, except that it required a bit too much rewriting of top-level code. (Essentially, far too many disparate places in the top-level code, particularly build event streaming, were relying on a rule's actions always being created during that rule's analysis, so this approach was not tenable at the time.)
As a second aside, the build configuration attached to a rule also needs to include all the configuration needed by transitive dependencies, even if the current rule doesn't need it. Interestingly, this is not why a non-test rule depending on a test rule is a problem. The test rule is already coded to 'skip' preparing a test action if the test options are missing (as it is assumed in this case that the test rule is being analyzed inside the graph, not at the top level, and thus the test action is useless). The action conflict problem occurs when that test rule is analyzed both at the top level with test options attached and inside the graph without test options attached. In that case two actions are created trying to build the same output file, and this is bad for cc actions, which are assumed to be unsharable (excluding the very new --experimental_shareable_cpp_compile_actions
flag mentioned by @torgil). (Note that by design we do not change the output directory/name of the output files based on the test flags or if the test flags are trimmed. This is unlike how we DO e.g. change the output directory based on the target platform in order to avoid action conflicts.)
> Thanks for pointing this out, we tested 2 and 3 seem to be useful workarounds (however having to use an experimental flag feels somehow unsatisfying).
How problematic is the experimental part of the flag? I am highly confident that using it for your cases is the best call here.
Part of the reason why the flag is marked as experimental (and kind of in limbo) is that it is not immediately obvious what the best default value for the flag is. The tradeoff is better incrementality performance for all non-test targets (if the flag is false; otherwise, for all non-testonly targets) versus correctness in avoiding potential action conflicts in the somewhat rare case where users have unsharable test targets (usually cc_test) depended on by non-test targets. Given the particularities of this flag, it would ideally only be used by users that truly need it.
We could set this flag to true by default and have the experimental part be setting it to false (i.e. trying to get better performance at the risk of correctness). There is a (hopefully minor) caveat that the flag can still cause correctness issues due to an internal Blaze bug as mentioned above; however, those are genuinely unintentional bugs to be fixed, not intentional breaking changes.
I am getting this error trying bazel test //... simply because of the presence of a filegroup like this.
```starlark
cc_test(
    name = "hello-cc",
    srcs = ["hello_test.cc"],
)

filegroup(
    name = "hello",
    testonly = True,
    srcs = [
        "//:hello-cc",
    ],
)
```
Remove the filegroup and the error goes away.
```
% bazel --version
bazel 5.0.0-homebrew
```
This is expected because filegroup is also a non-test rule.
Edit: What are you actually trying to do? Because just putting cc_test rules in a filegroup is a tad peculiar. Note that test_suite can be used to group tests together for execution all at once under an alias.
Needing filegroup would be more about pulling out the actual underlying test binaries (which exist and can be done but doing so is definitely more advanced behavior than usual). Often, the non-test can be replaced by a proper, custom test rule that does something with those binaries.
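For instance, the grouping use case above could be expressed with test_suite instead (the suite name is illustrative):

```starlark
# BUILD — group tests under one alias without putting a non-test rule
# (like filegroup) between the build graph and the cc_test
test_suite(
    name = "hello",
    tests = ["//:hello-cc"],
)
```

`bazel test //:hello` then runs everything in the suite at once, without tripping over --trim_test_configuration.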
Just thought I'd give an example showing how easy it is to hit this issue, especially with a non-executable (iirc) target in the graph.
I am slightly curious that testonly = True
doesn't help in that example.
Reviewing my older notes on this, there are a few experimental flags that may help here:
--experimental_retain_test_configuration_across_testonly
will make it so test configuration is not dropped for testonly targets. This will likely resolve the correctness issue at the expense of performance (since the domain of testonly can be rather deep under some test rules). This is not on globally because there are bugs in AspectFunction that are potentially triggered by the delayed trimming of the test configuration.
--experimental_shareable_cpp_compile_actions
makes C++ actions sharable and thus should avoid the action conflict in this case. I am unsure what the progress of migrating it to default-on is.
That said, having tests under non-test rules is very abnormal and can usually be replaced by using test_suite, or, in the more advanced cases (e.g. where you want to run the test binary as a subpart of a different test), by having the test rules directly under another test rule (e.g. in the original example, set test=True and executable=True for the dummy rule).
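A minimal sketch of that suggestion (rule and file names are hypothetical; note that in current Starlark, test = True already implies the rule is executable, so only test = True is set below):

```starlark
# dummy_test.bzl — make the "dummy" wrapper itself a test rule, so its
# dependency on cc_test targets no longer crosses a test/non-test boundary.

def _dummy_test_impl(ctx):
    # A test rule must produce an executable; a trivial always-passing
    # script suffices for a wrapper whose real purpose is its deps.
    script = ctx.actions.declare_file(ctx.label.name + ".sh")
    ctx.actions.write(script, "#!/bin/bash\nexit 0\n", is_executable = True)
    return [DefaultInfo(
        executable = script,
        runfiles = ctx.runfiles(files = ctx.files.deps),
    )]

dummy_test = rule(
    implementation = _dummy_test_impl,
    test = True,  # marks this as a *_test rule
    attrs = {"deps": attr.label_list()},
)
```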
Yes your earlier post was helpful.
> (e.g. in the original example, set test=True and executable=True for the dummy rule)
Is test a typo there? I don't believe these args are available to filegroup(), and its own doc simply says:
> Use filegroup to give a convenient name to a collection of targets. These can then be referenced from other rules.
That sounds convenient enough to use anywhere and without restrictions. Perhaps a note about this issue can be added to the docs?
I was referring to the dummy rule in the original post (https://github.com/bazelbuild/bazel/issues/14294#issue-1057216847). Sorry for the confusion.
> That said, having tests under non-test rules is very abnormal and can usually be replaced by using test_suite
One use-case is a single bazel target producing a summarized code quality report for a larger codebase without the need to collect lots of files and run scripts outside the build tree.
That said, it's not necessarily the test results from "bazel test" or "bazel coverage" that is needed. It's easy to depend on test results with Bazel today using a script or a tool that runs the test and saves the result.
> excluding the very new --experimental_shareable_cpp_compile_actions flag mentioned by @torgil
Made a pull request on this here #16863
Is there a flag we can enable to make the build fail if a target depends on a test target with a helpful error message so users aren't confused when they see this message?
This appears to be resolved in Bazel 6 (tested on 6.2.0). At least for the Java tests I attempted to reproduce the issue with.
🤔 This issue only mentions the problem in the context of CC, though... Quite possibly a change specific to the Java rules resolved the issue there, but it's worth retesting with the repro steps for CC on Bazel 6.2.0 to be sure.
Description of the problem / feature request:
When using 5.0.0-pre.20211011.2, calling a rule with testonly=True fails with the following error (see repro):
Feature requests: what underlying problem are you trying to solve with this feature?
Trying out new Bazel version on existing, large codebase
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
In an empty dir, create empty WORKSPACE file, and the following files:
What operating system are you running Bazel on?
macOS Big Sur
What's the output of bazel info release?
release 5.0.0-pre.20211011.2
If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.
Downloaded from Releases page
What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD?
Does not apply
Have you found anything relevant by searching the web?
No, found no similar issues reported
Any other information, logs, or outputs that you want to share?
You can reach out to me directly on Slack if I can help out with the repro