bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
22.99k stars 4.03k forks source link

`cc_test` should respect `exec_compatible_with` for test action #23202

Open keith opened 1 month ago

keith commented 1 month ago

Description of the bug:

If you have a simple constraint like this:

constraint_setting(name = "something")

constraint_value(
    name = "has_something",
    constraint_setting = ":something",
)

And you create a cc_test target that requires that setting:

cc_test(
    name = "test",
    srcs = ["test.cc"],
    exec_compatible_with = ["//:has_something"],
    target_compatible_with = ["//:has_something"],
)

and you're using remote exec with multiple platforms:

platform(
    name = "base-platform",
    constraint_values = [
        "@platforms//cpu:aarch64",
        "@platforms//os:linux",
    ],
)

platform(
    name = "something-platform",
    constraint_values = [
        "@platforms//cpu:x86_64",
        "@platforms//os:linux",
        ":has_something",
    ],
)

with:

build:remote --platforms=//:something-platform
build:remote --extra_execution_platforms=//:base-platform,//:something-platform

Building the cc_test target results in compilation happening on the something-platform, linking happening on base-platform and the test action most importantly happening on the base-platform.

It seems like theoretically this flag exists to help with this:

--experimental_add_exec_constraints_to_targets='//:test'=//:has_something

But this doesn't help in this case (unsurprisingly I think since you've already manually annotated the target with exec_compatible_with)

Passing --use_target_platform_for_tests does appear to fix this, but this flag has the downside that if you were to run 2 tests at once where not all required :has_something, they would both run on the something-platform instead of the base-platform, which is undesirable.

If you create a py_test (picked at random as another option) and create it the same way, it works as I'm expecting.

I assume this is because cc_test has exec_groups defined https://github.com/bazelbuild/bazel/blob/c71f0b6ded1de926bd1081fd4327142777711b03/src/main/starlark/builtins_bzl/common/cc/cc_test.bzl#L112-L116, but I would still expect a manually passed exec_compatible_with to override these?

If I delete the test exec_group (and deal with the fallout of that) it fixes the issue. I guess I would assume that any exec_compatible_with would be merged with the exec_group value (which isn't even passed in this case)?

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

No response

What is the output of bazel info release?

1acbd42c83b6cb4ac682a8362f08fdc565b50906 (or 7.x)

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

There are many threads on related topics here. Most about splitting the compilation / linking / testing across multiple exec groups, which I would also like for other cases. But I think this case is distinct.

Any other information, logs, or outputs that you want to share?

No response

keith commented 1 month ago

Here's a failing test case demonstrating this issue: https://github.com/bazelbuild/bazel/pull/23203

If i remove all exec_groups from cc_test it passes, which is what I expected behavior wise, that the passed exec_compatible_with / target_compatible_with would override / be combined with any from the default exec_groups.

Note that passing --use_target_platform_for_tests doesn't fix the test case either since the way it's currently configured nothing should be executed on platform_a just to make the assertions easier

sluongng commented 1 month ago

cc: @katre

fmeum commented 1 month ago

If your use case really doesn't need to specify exec_compatible_with on a per-target level, you could register a toolchain with the required (global) constraint and a bit of undocumented logic as @bazel_tools//tools/cpp:test_runner_toolchain_type. (Edit: See this draft version of a guide: https://github.com/trybka/scraps/blob/master/cc_test.md)

Adding exec constrains per-target is still a use case that should be supported. I do think that having exec_compatible_with add to all exec groups would not be the correct solution though: If you have a (hypothetical) cc_test target that compiles 1M lines of GPU DSL and thus requires compilation to run on a machine with fast CPU cores, you wouldn't want the //:fast_cpu constraint to be picked up by the test action group, simply because you may not have machines that satisfy both //:fast_cpu and //:has_gpu. Similar behavior is already problematic with --incompatible_allow_tags_propagation affecting both test and compilation actions (I know, it was me who flipped it...).

Instead, it would seem more flexible to be able to add constraints to individual exec groups. Since Bazel doesn't offer an attribute type for dict[str, list[Label]], we can't just have this:

cc_test(
    ...,
    exec_compatible_with = [
        "//:fast_cpu",
    ],
    exec_group_exec_compatible_with = {
        "test": ["//:has_gpu"],
    },
)

But what we could probably have is this, where each exec group declared on the rule results in a new <exec group>_exec_compatible_with attribute added to the rule:

cc_test(
    ...,
    # Applies to the default (unspecified) exec group only.
    exec_compatible_with = [
        "//:fast_cpu",
    ],
    test_exec_compatible_with = [
        "//:has_gpu",
    ],
)
keith commented 1 month ago

Seems like doing that toolchain setup would potentially work, given the lack of docs and public usages I was a bit wary when I saw that might be an option.

I agree that merging the exec_compatible_with with all exec_groups isn't ideal, and your suggested solution is the ideal, but also I think merging would still be preferred since you can't control those exec groups today without modifying the bazel source either AFAIUI?

Another option as a workaround is to stop using cc_test and replace it with a cc_binary and a custom test rule that just runs that binary as the test. Here's a minimal example:

def _binary_test_impl(ctx):
    output = ctx.actions.declare_file(ctx.label.name)
    ctx.actions.symlink(
        target_file = ctx.file.binary,
        output = output,
    )

    return [
        DefaultInfo(
            executable = output,
            runfiles = ctx.attr.binary[DefaultInfo].default_runfiles,
        ),
        ctx.attr.binary[RunEnvironmentInfo],
    ]

binary_test = rule(
    implementation = _binary_test_impl,
    attrs = {
        "binary": attr.label(allow_single_file = True),
    },
    test = True,
)