bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
22.96k stars 4.02k forks source link

Enabling unexported ("lambda") rules #14673

Open cpsauer opened 2 years ago

cpsauer commented 2 years ago

Hello, wonderful Bazel folks! Thanks for all you do.

This is a build language feature request.

With the advent of nested functions and lambdas, it's sometimes useful to create rules without a public name. For example, here's (an abbreviated) one that I wrote to be able to easily configure other rules to add platform transitions in my graph. [Maybe a higher order function isn't the most readable example for an issue, but that's also kind of the point: This is logic that's really nice to write once and then hide away.]

def configure_macro_creator(settings_to_override_dict={}, settings_to_add_dict={}):
    transitioning_rule = rule(<...>)
    def configure_wrapper(rule_to_configure):
        def macro_wrapper(name, **kwargs):
            wrapped_name = name + "-configured"
            transitioning_rule(name=name, exports=wrapped_name)
            rule_to_configure(name=wrapped_name, **kwargs)
        return macro_wrapper
    return configure

Anyway, conceptually, it can be quite useful to create a rule without an exported name if it's part of the internals of some macro.

Right now, this fails with Error in unexported rule: Invalid rule class hasn't been exported by a bzl file. But it strikes me that there might be a good way to support this inside, given it's useful. As an example, rules could be registered on creation with some ID, and kept usable in the graph that way.

The current workaround is to also return and store the rule that's intended to be hidden, private implementation, but it feels like it might be possible to support this use case better than that.

Thanks for your consideration, Chris (ex-Googler)

P.S. Related issue, but for repositories and in the other direction https://github.com/bazelbuild/bazel/issues/10441

fmeum commented 2 years ago

There is https://github.com/bazelbuild/bazel/commit/3a267cbf386b44ec6fb8564800bea4205e0a1002 (see also the linked Google doc), which can be used to automate the creation of "hidden" rules without the need to return them from a macro. At least in my limited experiments, the names set via the name attribute were allowed to collide.

cpsauer commented 2 years ago

Wow, spectacular cross-ref, @fmeum. Could be handy to allow some automatic, default name, or document this, but that's definitely a better workaround. [Update: Workaround may, sadly, be on its way out. See #16391]

Looks like we might be both barking up the same tree, which includes (conceptually) adding incoming transitions to existing rules so project-specific configuration is automatically configured in a build?

fmeum commented 2 years ago

It does seem so :). My aim is to provide a generic wrap_with_transition macro that applies configurable settings to an existing rule. The current interface looks like this:

cc_17_library = wrap_with_transition(
    native.cc_library,
    {
        "cxxopt": {
            "@platforms//os:windows": ["/std:c++17"],
            "//conditions:default": ["--std=c++17"],
        },
    },
)

Ultimately, I would like to see something like this added to skylib or another general-purpose rule set.

@cpsauer Feel free to contact me to discuss our common goal and perhaps join forces.

cpsauer commented 2 years ago

:) Yes, exactly the same goals. So glad you replied.

I've pulled together a terse but working/useful version of that for my team--but it turns out we don't really need right now. So it's in the scrapyard. Mostly, yours looks more sophisticated, but I might have at least a few useful things for you.

What would be the best forum for discussing? Maybe in an issue on your repo? (Just tag me there if so.) Also happy to talk live if you want.

[Update: We ended up needing the functionality after all, and have been delighted to join forces with fmeum. His rules_meta has worked great.]

fmeum commented 2 years ago

I opened an issue at https://github.com/fmeum/rules_meta/issues/1, let's continue over there (and perhaps close this issue).

gregestren commented 2 years ago

Reassigned to team-Configurability since this is a known and interesting use case.

Are these patterns working well for you? Runfiles in particular can get messed up since the existing rule doesn't "know" it's being wrapped. I worry that it's hard to provide generic solutions with wrappers since different rule types export their info differently.

fmeum commented 2 years ago

I would say they are working reasonably well in the case where the wrapper and the wrapped target reside in the same package. By symlinking the executable, runfiles appear to be picked up correctly.

What breaks today and would certainly warrant improvement is the story around env, env_inherit and args. These can't easily be made to work with wrappers as they are not represented by providers. There is testing.TestEnvironment which can be used to emulate the env attribute of native rules, but since the underlying provider isn't available in Starlark, it can't be forwarded from a wrapped Starlark rule.

cpsauer commented 2 years ago

Thanks, Greg! Adding my 2c.

I was seeing the result working decently in the library case we were interested in, but are definitely some rough edges you hit trying to make it more general. Other good examples that are harder to bump around than this issue are: https://github.com/bazelbuild/bazel/issues/14669 and the difficulty of wrapping executable rules expressed in @fmeum's repo, I think parallel to what you're saying about runfiles. Another independent difficulty, I think not yet mentioned, is automatically figuring out which options should be appended to vs. overridden, parallel to how they're treated on the command line.

Before we go too far down this path, I do wonder if there might be a better overall solution here for users, but needing the support of additional Bazel primitives.

Conceptually, being able to easily add transition-type configuration to targets in BUILD files seems like an important part of the configurability story. Both in a batch by making a configured rule as above, but also individually. This seems to be extra true for platform-specific targets. I see a lot of people working around this now by having a --config for each platform/target (think, e.g. what tensorflow does) but the resulting build :android --config=android is ugly, and prone to misuse, More importantly, since it's a top-level-only construct, it prevents those outputs from themselves being used recursively in the dependency-graph beauty that (IMO) makes Bazel great. All those get fixed if there's an easy way to bind configurations to targets, rather than just on the definition of new rules. I'm confident we'll need and want something like this down the line, even if my team doesn't need it right now. And I think the lack of it impedes folding in libraries that require configuration (like the tensorflow example), recursively into build graphs.

I can imagine a great experience for this, but I think it'd benefit from new primitives. What if all rules had an optional configurations attribute (like name or visibility), that took a list of dictionaries and transitions? You could configure buildable targets right where they stood, eliminating the duplicate and top-level configuration problem. That is, you could easily express in a BUILD file "this target needs this configuration" It'd be easy to add configuration to existing rules via a macro.

To sum: Adding transitions to new rules is easy, because Bazel supports it directly. I think it's also desirable to be able to easily configure targets where they're defined--as well as easily adding transitions to existing rules, as came up in previous comments here. But probably some new Bazel support would be needed to make that experience great.

Pulling back out, since I've been spinning ideas. Curious what you guys think and what I missed. One other option, probably worse, would be to create an "identity/passthrough" rule that can easily take a transition, and then use that as the basis of these features. That, too, would need new Starlark APIs to work robustly, though.


And if it's okay with everyone, I'm not going to close this just yet. If we're trying to build the subset of the above construction in starlark, I still do think unnamed rules are handy--or at least maybe worth better documenting the naming requirement that gets around this and adding it to the error message?

gregestren commented 2 years ago

I agree with everything you're expressing from a user perspective. We indeed have it as a configurability goal to support that pattern as much as we credibly can. This is pretty close to what we consider "flagless builds".

There are important nuances. Some is simple design: sometimes dependency injection can offer similar benefits more cleanly and efficiently. And every dep graph that depends on configuration is harder for automated tools like query (not cquery), BUILD parsers, etc. to understand. So it changes how easy it is for tools to manipulate build graphs.

The most important reason we haven't enabled arbitrary configuration changes on targets is efficiency. This is a huge challenge. It's extraordinarily easy to construct build graphs that are multiple times slower, larger, less efficient. https://docs.bazel.build/versions/main/skylark/config.html#memory-and-performance-consideration traces out an example. Configurations are a powerful yet dangerous tool and we want to be extra-careful users apply them responsibly and with full understanding of the efficiency consequences.

This is why we're also pursuing backend efficiency work like cross-platform-cacheable actions, --trim_test_configuration, efficient Starlark transition hashes, and tools to measure efficiency impacts. I can link you more precisely to any of these.

We're pushing most emphasis now on standardizing the platforms API across rules in the name of simplicity, interoperability, fewer more common flags. Once we've stabilized that we can imagine supporting, e.g., my_taget(platform_to_build_for = ["//:arm], providing tooling to understand how that impacts build memory, speed, and RE costs, and optimizing inefficiencies as best we can.

This is all large-scale work I expect to progress incrementally.

cpsauer commented 2 years ago

"Flagless builds" is a great name for that goal :)

If useful: As a user, the possibility of the exponential blowup you linked to occurred early on in my first read of the docs--before even I'd gotten down to that section. But it seemed unlikely that we'd do it accidentally without e.g. also causing duplicate symbol errors. My point is, I found those docs to be great, and they more than made the point to me, except maybe that I'd concluded we might already be fairly well-defended by the linker. So I turned my focus toward wanting to use the feature!

Anyway, thanks Greg, for asking great questions, caring about users, and being so thoughtful in your replies. I'm continuously impressed. Thanks for getting us to the root issues--and being way ahead in your thinking.