Open c-parsons opened 5 years ago
I've spoken to authors of rules_apple (@thomasvl @allevato @sergiocampama) who have established some pains with respect to this change. These are issues we should work to resolve before proceeding with this migration.
my_rule_private.bzl:
my_rule = rule(...)
my_rule.bzl:
load(':my_rule_private.bzl', _my_rule="my_rule")
def my_rule(name, **kwargs):
_my_rule(name = name, kwargs)
Loading an additional bzl file has performance implications, and this pattern requires annoying separation between related bits of code.
Possible workaround: One workaround would be if we allow rules to define their proper name to be different than their exported name. For example:
_my_rule = rule(exported_name = "my_rule", ...)
def my_rule(name, **kwargs):
_my_rule(name = name, kwargs)
This seems a high priority feature for Stardoc, and perhaps a blocker for this migration.
For completeness, we also discussed a question I have been asked before regarding this migration: Q: Aren't macros a thing we want to avoid? Why is the recommendation to use a macro to set output values? A: No, Starlark team discourages complex macros that users can depend on. If there is a 1:1 correspondence between macro and subsequent rule, the macro should be OK for users to depend on. That is all this migration plan suggests.
Aside: rules_apple/rules_swift can look into using output groups as well (which would avoid the need for a macro), but those output groups can't be referenced directly by other rules (i.e., there's no equivalent to :Foo.ipa
), right?
That's correct -- the only way to make optional outputs specifiable by label is to use attr.output
or attr.output_list
, whose values must be set by macro.
FWIW, the correct approach is to use both output groups and attr.output
. The former allows a user to build Foo.ipa by specifying:
bazel build //my:Foo --output_groups=ipa
and the latter allows building Foo.ipa by specifying:
bazel build //my:Foo.ipa
(or similarly a dependency on //my:Foo.ipa
from another target)
Looks like this is incompatible with rule transitions, I get the following error:
java.lang.RuntimeException: Unrecoverable error while evaluating node X:X.out BuildConfigurationValue.Key[ee7a6eae9f289e48bf8a1e925f4979c1] false' (requested by nodes )
at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:528)
at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:399)
at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1396)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:283)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.localPopAndExec(ForkJoinPool.java:950)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1607)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Caused by: com.google.common.base.VerifyException: Could not find prerequisite X:X in the expected configuration
at com.google.common.base.Verify.verify(Verify.java:124)
at com.google.common.base.Verify.verifyNotNull(Verify.java:500)
at com.google.devtools.build.lib.analysis.TargetContext.findDirectPrerequisite(TargetContext.java:113)
at com.google.devtools.build.lib.analysis.ConfiguredTargetFactory.createConfiguredTarget(ConfiguredTargetFactory.java:213)
at com.google.devtools.build.lib.skyframe.SkyframeBuildView.createConfiguredTarget(SkyframeBuildView.java:774)
at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.createConfiguredTarget(ConfiguredTargetFunction.java:891)
at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.compute(ConfiguredTargetFunction.java:351)
at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:451)
Granted, this is also an issue with implicit outputs, just wanted to bring this into attention in case the solution is different.
Without the ability to manually specify the rule kind, encouraging the macro-based solution is going further obscure the fact that the left-hand-side of the assignment from the return value of rule
is part of the public API surface of a rule. We already have enough of a problem with this as it is (to say nothing of mnemonic
) and I think the current direction in this bug will make it worse.
+1 to @shahms coment. Using the macro to wrap the rule needlessly obscures the true rule.. It also makes it harder to automatically generate documentation, because now we have a forwarder in the middle. It would not so bad if we still had default on attr.output, but that went away.
Chris, could you update the status, and the priority if this is not active work? Thanks.
Indeed, this has been deprioritized but this bug was not correctly updated to reflect the lower priority. My bad.
I frankly haven't looked at this in quite some time. I think my previous comment is the best summary of this bug.
As a user of rules, the real problem I have with outputs is that there are outputs I cannot reference directly. Unfortunately, this change risks encouraging this poor behavior exactly where it's most vexing.
The rules that have the greatest need for declaring outputs with patterned names are complex rules that output files of different types (e.g. debug info, source maps, loaders etc.). Because these files have different purposes, it’s important to be able to pick the one you want instead of grabbing the whole bag.
This use case is well-served by rule.outputs: you declare your outputs, name them with patterns, and they are “made available as labels” (user POV: they work).
The change proposed here means that even if rules declare their outputs, they “won’t be made available as labels” (user POV: they’re broken, and I need to use a genrule to make a copy of the file I want, which will then work). To get working outputs rule authors will need to go out of their way and write a macro, which they won’t do until thirty users run into the issue and give up, twenty run into it and hack a workaround, and the fifty-first files a bug, at which point the macro may be added. Rinse and repeat for each new rule.
BTW, I took the user POV here, but in reality I was also the author of the rule I was having an issue with. Both of me agree that we want declared labeled outputs in the rule.
Update: We'd still like to remove implicit outputs, but at the moment don't have the bandwidth to reassess the alternatives users would migrate to. We won't actually flip the flag until we have time to address that.
Is it possible to get some more attention on this?
Huge numbers of projects (e.g. rules_docker, rules_nodejs, I believe many others) are using this feature extensively. There's no clear guidance on why it's deprecated or what they should be doing instead.
It's uncomfortable to build lots of code around deprecated behavior, but these projects have found it so useful, they do it anyway.
Going back to
The rules that have the greatest need for declaring outputs with patterned names are complex rules that output files of different types (e.g. debug info, source maps, loaders etc.). Because these files have different purposes, it’s important to be able to pick the one you want instead of grabbing the whole bag.
Let's do an example. Let's assume we have a compiler that produces a debug file, but the name should be based on a value we get from the toolchain. We can not know this until Analysis time, so the current outputs functionality does not even work. Let's call that the 'debug' output.
The way I approach this is:
ctx.label.name + '_' + toolchain['fropnitz'] + '.dbg'
)OutputGroupInfo
filegroup(..., output_group = "debug", ...)
to turn that slice of the rule output into a usable label.We have an actual case in rules_pkg
. The pkg_deb
rule emits the .deb file as the primary output, and the .changes file name is calculated from that. I might right something like:
pkg_deb(name = "deb", ...)
filegroup(
name = "the_changes_file",
srcs = [":deb"],
output_group = "changes",
)
genrule(
name = "use_changes_file",
srcs = [":the_changes_file"],
outs = ["copy_of_changes.txt"],
cmd = "cp $(location :the_changes_file) $@",
)
This might appear a little clunky, but often the filegroup
s can be created by a macro wrapper around the rule.
Output groups work, though they are definitely less ergonomic than implicit outputs.
Let's assume we have a compiler that produces a debug file, but the name should be based on a value we get from the toolchain.
Is that....is that a real example?
In any cases, yes output groups do work, though with a bit more awkwardness.
For context, a real-world example is rules_docker's container_image
:
The implicit output targets are:
[name].tar: A full Docker image containing all the layers, identical to what docker save would return. This is only generated on demand.
[name].digest: An image digest that can be used to refer to that image. Unlike tags, digest references are immutable i.e. always refer to the same content.
[name]-layer.tar: A Docker image containing only the layer corresponding to that target. It is used for incremental loading of the layer.
Note: this target is not suitable for direct consumption. It is used for incremental loading and non-docker rules should depend on the Docker image ([name].tar) instead.
[name]: The incremental image loader. It will load only changed layers inside the Docker registry.
https://github.com/bazelbuild/rules_docker/blob/master/docs/container.md#container_image
Using deprecated features, tsk, tsk.
Output groups work, though they are definitely less ergonomic than implicit outputs.
Let's assume we have a compiler that produces a debug file, but the name should be based on a value we get from the toolchain.
Is that....is that a real example?
Not the compiler, but the example is real for pkg_deb. The Debian standard for naming packages includes the target hardware architecture, so we have rules that can pick that up and use it to create the name. https://github.com/bazelbuild/rules_pkg/tree/main/examples/naming_package_files
In any cases, yes output groups do work, though with a bit more awkwardness.
True. But OutputGroupInfo solves the problem of names you can not predict until analysis time, so it's a mixed call.
For context, a real-world example is rules_docker's
container_image
:The implicit output targets are: [name].tar: A full Docker image containing all the layers, identical to what docker save would return. This is only generated on demand. [name].digest: An image digest that can be used to refer to that image. Unlike tags, digest references are immutable i.e. always refer to the same content. [name]-layer.tar: A Docker image containing only the layer corresponding to that target. It is used for incremental loading of the layer. Note: this target is not suitable for direct consumption. It is used for incremental loading and non-docker rules should depend on the Docker image ([name].tar) instead. [name]: The incremental image loader. It will load only changed layers inside the Docker registry.
https://github.com/bazelbuild/rules_docker/blob/master/docs/container.md#container_image
As long as this is just variations on [name], then outputs is eaiser to use. But this also seems like something you could do with explicit outputs and a macro wrapping the rule to add explicit output attributes for them. I don't particularly like having to write wapper macros myself, but my objections are mostly about how they show up in documentation and query output.
Using deprecated features, tsk, tsk.
What deprecated feature?
What deprecated feature?
Implicit outputs. They are deprecated, noted by both the docs and with the CLI flag.
But they are so useful, rules_docker doesn't care. (Or at least, not enough to change now.)
What deprecated feature?
Implicit outputs. They are deprecated, noted by both the docs and with the CLI flag.
But they are so useful, rules_docker doesn't care. (Or at least, not enough to change now.)
The outputs_param is deprecated, but having a rule create implicit outputs with declare is not. :-) But that is just semantic nitpicking.
No matter what the resolution to this issue is, we need functionality that lets a rule determine (at analysis time) the actual file name which gets created for an output. That is, wrapper macros are insufficient. The outputs param was a first cut at this - but there are claims it adds unwanted complexity in Bazel (I don't have an opinion on that). Putting things in OutputGroupInfo is a more general approach, but it has the annoyance that you need an intermediate filegroup to refer explicitly to any of the outputs declared at run time.
Perhaps we could do something where attr.output had an additional parameter that defined a label template. Maybe `spec: attr.output(label='${name}.spec'). This would declare a new Label bound to that output from the rule. Other targets could depend on the 'name.spec' name in build files. Even better is if we had a ctx.action that could set the actual file name of an output. Sort of a variant of declare_file, but it only resets the name of an already declared output. I don't claim that this is even possible. I'm just trying to work up APIs that solve user problems and relatively easy to understand.
I think attr.output(label=)
is good.
Though IDK why implicit outputs were deprecated in the first place.
According to https://buildkite.com/bazel/bazelisk-plus-incompatible-flags/builds/1253
This flag is breaking almost all downstream projects, I believe it's the same for the wider community. Also this flag is marked at P4, if no active plan on migrating the ecosystem, I'll remove "migration-ready" label for now.
Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.
This flag disables the
outputs
parameter on therule()
function.Flag: --incompatible_no_rule_outputs_param
Motivation Output Map Madness outlines some problems and inconsistencies in using this pattern, and there are also a couple of github issues (#2883 and #5581) on this matter.
Additional details in #6241
Migration Use
OutputGroupInfo
specify output files instead.For example, replace:
with:
Note that this migration approach does not automatically let you specify these outputs by label. For example, you will notice that, with this exact approach, you cannot specify the ".exe"-suffix as its own label. In order to preserve this implicit-output functionality, you will need to use
attr.output
with a macro-specified default. For example: