bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
23.13k stars 4.05k forks source link

`attr` module not rich enough to describe `bzlmod` attributes #14572

Closed shs96c closed 2 years ago

shs96c commented 2 years ago

Description of the problem / feature request:

rules_jvm_external (RJE) uses macros to allow people to specify complex maven requirements, such as exclusions, or to ask for a particular maven classifier to be downloaded.

The Bazel attr module isn't rich enough to express this kind of content, but the macro in RJE generates a JSON string which can be passed in as an attr.string and then decoded using json.decode to revivify it. It's not lovely, but it does work.

Without access to macros in a MODULE.bazel, it's not possible to replicate this workflow, and since the attr module lacks the richness to allow this to be specified it won't be possible to migrate RJE to use modules.

Feature requests: what underlying problem are you trying to solve with this feature?

I guess that there are three possibilities for fixing this:

The last of these is probably the most realistic :)

What's the output of bazel info release?

5.0.0rc3

Wyverald commented 2 years ago

Unfortunately the last of these is actually the least realistic, since MODULE.bazel evaluation happens before the dependencies are fetched (or even which version we need is determined at all). This is the real reason why load statements aren't permitted in MODULE.bazel files, because it's simply impossible to support. (Similarly, tags aren't type-checked until right before module extension evaluation starts; before that, tags are just stored verbatim.)

I think a fourth possibility might be just to "deal with it" and use an API that doesn't include very complex attribute structures. attr is intentionally limited in its expressiveness mostly for the consideration of BUILD files (see linked code comment below). The same arguments don't apply as much to MODULE.bazel files, but still, food for thought.

https://github.com/bazelbuild/bazel/blob/3d78dc7e3a927612c23a6d1542bc38aad5a9703e/src/main/java/com/google/devtools/build/lib/packages/Type.java#L58-L80

shs96c commented 2 years ago

Can you suggest a way of expressing this kind of constraint in a way that "deals with it", but which is also something users might be comfortable using? Users do expect to be able to specify exclusions on a per-artifact basis, and I can't see a particularly neat way of doing this.

Perhaps a use_macro(bzl_file, macro_name) could be provided? That would be sufficient to allow us to do what we need to do, and presumably it could follow the same semantics about visibility as use_extension does?

fmeum commented 2 years ago

I wonder whether most of these use cases for complex attributes couldn't be solved by offering an equivalent of the following Starlark function in MODULE.bazel:

def complex_value(**kwargs):
    return "<some UUID>" + json.encode(kwargs)

This would allow for a uniform way to pass essentially arbitrary Starlark values into module tag attributes without having to add any complex new feature to Bazel. Module extensions could just consume these values via attr.string() or attr.string_list() and e.g. bazel_skylib could provide convenience functions such as is_complex_value and parse_complex_value (that essentially detect and strip off the UUID.

@shs96c What do you think, could this "generic" version of maven.artifact be usable enough?

Interestingly and very much unlike BUILD files, MODULE.bazel does allow function definitions. That means that if a future version of Bazel were to introduce such a global helper function, it would even be possible to offer a polyfill for Bazel 5.0.0 users they could just paste into their MODULE.bazel. @Wyverald Is it intended behavior that MODULE.bazel permits function definitions?

shs96c commented 2 years ago

At that point, would it be simpler to have an attr.json? That would allow someone to pass in whatever shaped data they wanted, and the rule implementation function could get the decoded json structure back out (without needing to call json.decode)

It’s effectively the same as this suggestion, but requires fewer additional support functions or changes to how modules are processed.

Sent from my iPhone

On 13 Jan 2022, at 22:43, Fabian Meumertzheim @.***> wrote:

 I wonder whether most of these use cases for complex attributes couldn't be solved by offering an equivalent of the following Starlark function in MODULE.bazel:

def complex_value(**kwargs): return "" + json.encode(kwargs) This would allow for a uniform way to pass essentially arbitrary Starlark values into module tag attributes without having to add any complex new feature to Bazel. Module extensions could just consume these values via attr.string() or attr.string_list() and e.g. bazel_skylib could provide convenience functions such as is_complex_value and parse_complex_value (that essentially detect and strip off the UUID.

@shs96c What do you think, could this "generic" version of maven.artifact be usable enough?

Interestingly and very much unlike BUILD files, MODULE.bazel does allow function definitions. That means that if a future version of Bazel were to introduce such a global helper function, it would even be possible to offer a polyfill for Bazel 5.0.0 users they could just paste into their MODULE.bazel. @Wyverald Is it intended behavior that MODULE.bazel permits function definitions?

— Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android. You are receiving this because you were mentioned.

fmeum commented 2 years ago

Yes, that would be easier to work with for both module authors and Bazel users, but given the experience with adding new types described in the comment on Type.java, I don't think that a general new attr type has a chance to make it into Bazel. I don't know how complex it would be to add a new attr type for use in MODULE.bazel only, hence my idea for a different solution.

shs96c commented 2 years ago

In a new attribute type isn't a possibility, a use_macro function feels similar to the complex_value suggestion above, but offers more flexibility for unforeseen use cases.

illicitonion commented 2 years ago

Just to +1 the need to solve this somehow more than just living with it, here's an example from rules_rust's equivalent of rules_jvm_external:

https://github.com/bazelbuild/rules_rust/blob/ea22478f6dac4604b4ef38fb8e713cd1c0880084/examples/crate_universe/has_aliased_deps/workspace.bzl#L12-L34

This entire object is just sufficient to specify the parameters to build one moderately complex third party dependency - it's conceivable that a single WORKSPACE may contain tens of this kind of attribute.

Wyverald commented 2 years ago

@fmeum

Is it intended behavior that MODULE.bazel permits function definitions?

No, it's a TODO left in the code :P there shouldn't be any def, if, etc. statements in MODULE.bazel, not until we consciously allow them.

@shs96c

Can you suggest a way of expressing this kind of constraint in a way that "deals with it", but which is also something users might be comfortable using? Users do expect to be able to specify exclusions on a per-artifact basis, and I can't see a particularly neat way of doing this.

We talked about this offline today, but for posterity: what I suggest is that we exploit the flexibility of the tag syntax. Say you have something like the following in today's rules_jvm_external:

maven_install(
    name = "maven",
    artifacts = [
        maven.artifact(
            "foo:bar:1.0",
            maven.exclude(
                foo = "bar",
                bar = "foo",
            ),
        ),
        "blah:bleh:2.0",
        "blih:bluh:3.0",
    ],
)

It's all in one call because we're generating just one repo. But because tags are aggregated in Bzlmod, we could just split this into multiple tags, effectively "flattening" the attribute structure:

maven.install(coords = [
    "blah:bleh:2.0",
    "blih:bluh:3.0",
])
maven.artifact(coord = "foo:bar:1.0", exclude = { "foo" : "bar", "bar" : "foo" })

This isn't always easy (expecially if you have very complex inputs with 3+ layers), but it probably will deal with most sensible cases.


Like others have alluded to, an attr.json type is definitely not making it into the BUILD language. It could work as a Bzlmod-only attribute type but that is obviously not ideal either: discrepancy between use sites of attr isn't great, and considering we (Bzlmod) have similar needs as the BUILD language regarding attributes such as serializing and tracking the labels inside, we'd have similar reservations about introducing it anyway.

I'd like to encourage people to try the "flattening" idea above and see whether it works for your use case. If it's still really awkward, we can deliberate further.

Wyverald commented 2 years ago

Closing this for now. If there are cases where the flattening mentioned above doesn't work, we can reopen.