bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
23.2k stars 4.06k forks source link

incompatible_no_rule_outputs_param: disable outputs param of rule function #7977

Open c-parsons opened 5 years ago

c-parsons commented 5 years ago

This flag disables the outputs parameter on the rule() function.

Flag: --incompatible_no_rule_outputs_param

Motivation Output Map Madness outlines some problems and inconsistencies in using this pattern, and there are also a couple of github issues (#2883 and #5581) on this matter.

Additional details in #6241

Migration Use OutputGroupInfo specify output files instead.

For example, replace:

def _my_rule_impl(ctx):
    ...

my_rule = rule(
    implementation = _my_rule_impl,
    outputs = {“bin”: “%{name}.exe”},
)

with:

def _rule_impl(ctx):
    ...
    bin_name = ctx.attr.name + “.exe”
    bin_output = ctx.actions.declare_file(bin_name)
    # bin_output will, as before, need to be the output of an action
    ...
    return [OutputGroupInfo(bin = depset([bin_output])]

my_rule = rule(
    implementation = _my_rule_impl,
)

Note that this migration approach does not automatically let you specify these outputs by label. For example, you will notice that, with this exact approach, you cannot specify the ".exe"-suffix as its own label. In order to preserve this implicit-output functionality, you will need to use attr.output with a macro-specified default. For example:

def _rule_impl(ctx):
    ...
    bin_output = ctx.outputs.bin_output
    # bin_output will, as before, need to be the output of an action
    ...
    return [OutputGroupInfo(bin = depset([bin_output])]

_my_rule = rule(
    implementation = _my_rule_impl,
    attrs = {
       "bin_output" : attr.output(),
    }
)

def my_rule(name, **kwargs):
    my_rule(name = name,
        bin_output = name + ".exe",
        **kwargs)
c-parsons commented 5 years ago

I've spoken to authors of rules_apple (@thomasvl @allevato @sergiocampama) who have established some pains with respect to this change. These are issues we should work to resolve before proceeding with this migration.

  1. It's unfortunate and a pain to make rule-name consistent with macro name. The only way is to define the rule in another file, export it as a different name, and call that from the macro. That is:

my_rule_private.bzl:

my_rule = rule(...)

my_rule.bzl:

load(':my_rule_private.bzl', _my_rule="my_rule")
def my_rule(name, **kwargs):
    _my_rule(name = name, kwargs)

Loading an additional bzl file has performance implications, and this pattern requires annoying separation between related bits of code.

Possible workaround: One workaround would be if we allow rules to define their proper name to be different than their exported name. For example:

_my_rule = rule(exported_name = "my_rule", ...)

def my_rule(name, **kwargs):
    _my_rule(name = name, kwargs)
  1. This requires writing documentation in parallel for a macro and the rule it requires. Stardoc should look to the rule documentation and no macro documentation should be required! However, skylint aggressively indicates these macros must be documented. (Except, it may be best if one were able to document default values of output attributes alongside the macro which controls it -- but this should merely enhance the existing rule documentation, not require duplicating the rule's docs!)

This seems a high priority feature for Stardoc, and perhaps a blocker for this migration.

For completeness, we also discussed a question I have been asked before regarding this migration: Q: Aren't macros a thing we want to avoid? Why is the recommendation to use a macro to set output values? A: No, Starlark team discourages complex macros that users can depend on. If there is a 1:1 correspondence between macro and subsequent rule, the macro should be OK for users to depend on. That is all this migration plan suggests.

allevato commented 5 years ago

Aside: rules_apple/rules_swift can look into using output groups as well (which would avoid the need for a macro), but those output groups can't be referenced directly by other rules (i.e., there's no equivalent to :Foo.ipa), right?

c-parsons commented 5 years ago

That's correct -- the only way to make optional outputs specifiable by label is to use attr.output or attr.output_list, whose values must be set by macro.

FWIW, the correct approach is to use both output groups and attr.output. The former allows a user to build Foo.ipa by specifying:

bazel build //my:Foo --output_groups=ipa

and the latter allows building Foo.ipa by specifying:

bazel build //my:Foo.ipa

(or similarly a dependency on //my:Foo.ipa from another target)

sergiocampama commented 5 years ago

Looks like this is incompatible with rule transitions, I get the following error:

java.lang.RuntimeException: Unrecoverable error while evaluating node X:X.out BuildConfigurationValue.Key[ee7a6eae9f289e48bf8a1e925f4979c1] false' (requested by nodes )
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:528)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:399)
    at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1396)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:283)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.localPopAndExec(ForkJoinPool.java:950)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1607)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Caused by: com.google.common.base.VerifyException: Could not find prerequisite X:X in the expected configuration
    at com.google.common.base.Verify.verify(Verify.java:124)
    at com.google.common.base.Verify.verifyNotNull(Verify.java:500)
    at com.google.devtools.build.lib.analysis.TargetContext.findDirectPrerequisite(TargetContext.java:113)
    at com.google.devtools.build.lib.analysis.ConfiguredTargetFactory.createConfiguredTarget(ConfiguredTargetFactory.java:213)
    at com.google.devtools.build.lib.skyframe.SkyframeBuildView.createConfiguredTarget(SkyframeBuildView.java:774)
    at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.createConfiguredTarget(ConfiguredTargetFunction.java:891)
    at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.compute(ConfiguredTargetFunction.java:351)
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:451)

Granted, this is also an issue with implicit outputs, just wanted to bring this into attention in case the solution is different.

shahms commented 5 years ago

Without the ability to manually specify the rule kind, encouraging the macro-based solution is going further obscure the fact that the left-hand-side of the assignment from the return value of rule is part of the public API surface of a rule. We already have enough of a problem with this as it is (to say nothing of mnemonic) and I think the current direction in this bug will make it worse.

aiuto commented 5 years ago

+1 to @shahms coment. Using the macro to wrap the rule needlessly obscures the true rule.. It also makes it harder to automatically generate documentation, because now we have a forwarder in the middle. It would not so bad if we still had default on attr.output, but that went away.

alandonovan commented 3 years ago

Chris, could you update the status, and the priority if this is not active work? Thanks.

c-parsons commented 3 years ago

Indeed, this has been deprioritized but this bug was not correctly updated to reflect the lower priority. My bad.

I frankly haven't looked at this in quite some time. I think my previous comment is the best summary of this bug.

camillol commented 3 years ago

As a user of rules, the real problem I have with outputs is that there are outputs I cannot reference directly. Unfortunately, this change risks encouraging this poor behavior exactly where it's most vexing.

The rules that have the greatest need for declaring outputs with patterned names are complex rules that output files of different types (e.g. debug info, source maps, loaders etc.). Because these files have different purposes, it’s important to be able to pick the one you want instead of grabbing the whole bag.

This use case is well-served by rule.outputs: you declare your outputs, name them with patterns, and they are “made available as labels” (user POV: they work).

The change proposed here means that even if rules declare their outputs, they “won’t be made available as labels” (user POV: they’re broken, and I need to use a genrule to make a copy of the file I want, which will then work). To get working outputs rule authors will need to go out of their way and write a macro, which they won’t do until thirty users run into the issue and give up, twenty run into it and hack a workaround, and the fifty-first files a bug, at which point the macro may be added. Rinse and repeat for each new rule.

BTW, I took the user POV here, but in reality I was also the author of the rule I was having an issue with. Both of me agree that we want declared labeled outputs in the rule.

brandjon commented 3 years ago

Update: We'd still like to remove implicit outputs, but at the moment don't have the bandwidth to reassess the alternatives users would migrate to. We won't actually flip the flag until we have time to address that.

pauldraper commented 2 years ago

Is it possible to get some more attention on this?

Huge numbers of projects (e.g. rules_docker, rules_nodejs, I believe many others) are using this feature extensively. There's no clear guidance on why it's deprecated or what they should be doing instead.

It's uncomfortable to build lots of code around deprecated behavior, but these projects have found it so useful, they do it anyway.

aiuto commented 2 years ago

Going back to

The rules that have the greatest need for declaring outputs with patterned names are complex rules that output files of different types (e.g. debug info, source maps, loaders etc.). Because these files have different purposes, it’s important to be able to pick the one you want instead of grabbing the whole bag.

Let's do an example. Let's assume we have a compiler that produces a debug file, but the name should be based on a value we get from the toolchain. We can not know this until Analysis time, so the current outputs functionality does not even work. Let's call that the 'debug' output.

The way I approach this is:

We have an actual case in rules_pkg. The pkg_deb rule emits the .deb file as the primary output, and the .changes file name is calculated from that. I might right something like:

pkg_deb(name = "deb", ...)

filegroup(
    name = "the_changes_file",
    srcs = [":deb"],
    output_group = "changes",
)

genrule(
    name = "use_changes_file",
    srcs = [":the_changes_file"],
    outs = ["copy_of_changes.txt"],
    cmd = "cp $(location :the_changes_file) $@",
)

This might appear a little clunky, but often the filegroups can be created by a macro wrapper around the rule.

pauldraper commented 2 years ago

Output groups work, though they are definitely less ergonomic than implicit outputs.

Let's assume we have a compiler that produces a debug file, but the name should be based on a value we get from the toolchain.

Is that....is that a real example?

In any cases, yes output groups do work, though with a bit more awkwardness.


For context, a real-world example is rules_docker's container_image:

The implicit output targets are:

[name].tar: A full Docker image containing all the layers, identical to what docker save would return. This is only generated on demand.

[name].digest: An image digest that can be used to refer to that image. Unlike tags, digest references are immutable i.e. always refer to the same content.

[name]-layer.tar: A Docker image containing only the layer corresponding to that target. It is used for incremental loading of the layer.

Note: this target is not suitable for direct consumption. It is used for incremental loading and non-docker rules should depend on the Docker image ([name].tar) instead.

[name]: The incremental image loader. It will load only changed layers inside the Docker registry.

https://github.com/bazelbuild/rules_docker/blob/master/docs/container.md#container_image

Using deprecated features, tsk, tsk.

aiuto commented 2 years ago

Output groups work, though they are definitely less ergonomic than implicit outputs.

Let's assume we have a compiler that produces a debug file, but the name should be based on a value we get from the toolchain.

Is that....is that a real example?

Not the compiler, but the example is real for pkg_deb. The Debian standard for naming packages includes the target hardware architecture, so we have rules that can pick that up and use it to create the name. https://github.com/bazelbuild/rules_pkg/tree/main/examples/naming_package_files

In any cases, yes output groups do work, though with a bit more awkwardness.

True. But OutputGroupInfo solves the problem of names you can not predict until analysis time, so it's a mixed call.

For context, a real-world example is rules_docker's container_image:

The implicit output targets are: [name].tar: A full Docker image containing all the layers, identical to what docker save would return. This is only generated on demand. [name].digest: An image digest that can be used to refer to that image. Unlike tags, digest references are immutable i.e. always refer to the same content. [name]-layer.tar: A Docker image containing only the layer corresponding to that target. It is used for incremental loading of the layer. Note: this target is not suitable for direct consumption. It is used for incremental loading and non-docker rules should depend on the Docker image ([name].tar) instead. [name]: The incremental image loader. It will load only changed layers inside the Docker registry.

https://github.com/bazelbuild/rules_docker/blob/master/docs/container.md#container_image

As long as this is just variations on [name], then outputs is eaiser to use. But this also seems like something you could do with explicit outputs and a macro wrapping the rule to add explicit output attributes for them. I don't particularly like having to write wapper macros myself, but my objections are mostly about how they show up in documentation and query output.

Using deprecated features, tsk, tsk.

What deprecated feature?

pauldraper commented 2 years ago

What deprecated feature?

Implicit outputs. They are deprecated, noted by both the docs and with the CLI flag.

But they are so useful, rules_docker doesn't care. (Or at least, not enough to change now.)

aiuto commented 2 years ago

What deprecated feature?

Implicit outputs. They are deprecated, noted by both the docs and with the CLI flag.

But they are so useful, rules_docker doesn't care. (Or at least, not enough to change now.)

The outputs_param is deprecated, but having a rule create implicit outputs with declare is not. :-) But that is just semantic nitpicking.

No matter what the resolution to this issue is, we need functionality that lets a rule determine (at analysis time) the actual file name which gets created for an output. That is, wrapper macros are insufficient. The outputs param was a first cut at this - but there are claims it adds unwanted complexity in Bazel (I don't have an opinion on that). Putting things in OutputGroupInfo is a more general approach, but it has the annoyance that you need an intermediate filegroup to refer explicitly to any of the outputs declared at run time.

Perhaps we could do something where attr.output had an additional parameter that defined a label template. Maybe `spec: attr.output(label='${name}.spec'). This would declare a new Label bound to that output from the rule. Other targets could depend on the 'name.spec' name in build files. Even better is if we had a ctx.action that could set the actual file name of an output. Sort of a variant of declare_file, but it only resets the name of an already declared output. I don't claim that this is even possible. I'm just trying to work up APIs that solve user problems and relatively easy to understand.

pauldraper commented 2 years ago

I think attr.output(label=) is good.

Though IDK why implicit outputs were deprecated in the first place.

meteorcloudy commented 2 years ago
image

According to https://buildkite.com/bazel/bazelisk-plus-incompatible-flags/builds/1253

This flag is breaking almost all downstream projects, I believe it's the same for the wider community. Also this flag is marked at P4, if no active plan on migrating the ecosystem, I'll remove "migration-ready" label for now.

github-actions[bot] commented 1 week ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.