bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
22.76k stars 3.99k forks source link

Ensure only a minimal set of external dependencies vendored for building a hello-world binary #22684

Open meteorcloudy opened 1 month ago

meteorcloudy commented 1 month ago

Description of the bug:

C++ hello-world

Using C++ binary as an example.

To build a C++ hello-world binary offline

cc_binary(
  name = "bin",
  srcs = ["main.cc"],
)

we'll have to vendor the following repositories with Bazel built at 96b90baf08172bb55844c2366111a0c39b6f9e14 or later.

While, the first four as acceptable, rules_java and rules_python should not be required by default for building a simple C++ hello world program.

The same can be apply to building hello-world program of other languages.

Java hello-world

Suggested solution

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

https://github.com/meteorcloudy/minimal_vendor_test/tree/main

Which operating system are you running Bazel on?

macOS, Linux, Windows

What is the output of bazel info release?

96b90baf08172bb55844c2366111a0c39b6f9e14 or later

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

fmeum commented 1 month ago

Since rules_java and rules_python both register toolchains , I don't see a way to remove them short of getting rid of the dependency from bazel_tools alltogether.

meteorcloudy commented 1 month ago

I think we can probably get rid of rules_python in MODULE.tools, I don't see any obvious reason why it should be there.

fmeum commented 1 month ago

I think that if we remove it, projects that use py_binary without a load or dependency on rules_python would no longer find a Python toolchain.

meteorcloudy commented 1 month ago

Ah, I see, thanks for pointing that out. Then, we can probably only do that in Bazel 8.

meteorcloudy commented 1 month ago

@katre @Wyverald As you can see, many dependencies are required only because they register toolchains. Should we consider introducing a way for the root module to exclude toolchains registered by modules in the dependency graph?

Maybe a simple

toolchain_override(
  exclude = "*",
)

?

katre commented 1 month ago

If the toolchains aren't needed, why is rules_java a dependency? Is it somehow referred to from @bazel_tools? Do you have a code link I can check?

meteorcloudy commented 1 month ago

yes, rules_java is in MODULE.tools: https://cs.opensource.google/bazel/bazel/+/master:src/MODULE.tools;l=8?q=MODULE.tools&ss=bazel

And the toolchain is registered in rules_java's MODULE.bazel:

katre commented 1 month ago

Okay, did a bit of testing.

The @rules_java dependency is added in MODULE.tools, and referenced from a number of places, but most importantly from tools/jdk/BUILD.tools, which creates a number of aliases for legacy targets.

Of these, the most important seems to be @bazel_tools//tools/jdk:current_java_runtime, which is itself referenced from a number of places, mostly from the native and builtin Java semantics (and also from the Android rules).

I see a lot of test failures by simply removing these, but it's very unclear which are actual failures, and which are tests that implicitly require Java to work but don't actually add a rules_java dependency.

Trying to follow this further will take a lot of effort and cleanup (if nothing else, to make our integration tests be honest about what languages they require).

lberki commented 1 month ago

Oof. Do I understand correctly that the root cause is that Python/Java rules require the Python/Java toolchains which are registered in @rules_python / @rules_java, therefore, if we don't depend on them by default, Python and Java rules cease working. The fix for that is to explicitly require a reference to the rules one needs but that can't happen before Bazel 8 because it would be a very incompatible change?

Owie, then this will require quite a bit of thought to figure out a reasonable (=can be done quickly and doesn't cause too much collateral damage) fix for...

And it is an entirely reasonable request that builds that don't use Python or Java don't depend on their respective rule sets, in fact, it's quite embarrassing that they currently do so...

katre commented 1 month ago

It's possible that if the java rules are completely unused, then there is no need for rules_java, it's just very very hard to test that given the current state of things.

Setting up a small toy project and trying to build it with a suitably hacked up bazel is probably the best way to test this.

lberki commented 1 month ago

That, I hope, but it doesn't help by itself because there is no way to tell Bazel "I don't need @rules_java, pinkie promise, please don't ever look at it", is there?

meteorcloudy commented 1 month ago

It's possible that if the java rules are completely unused, then there is no need for rules_java, it's just very very hard to test that given the current state of things.

You can use a local_path_override to override rules_java to an empty module, if the build still works, then rules_java isn't needed.

lberki commented 1 month ago

cc @comius -- this discussion might have some bearing on the eventual approach to decoupling rule implementations from the Bazel binary.

katre commented 1 month ago

That's essentially what @meteorcloudy was asking for in https://github.com/bazelbuild/bazel/issues/22684#issuecomment-2160601261

Maybe a simple

toolchain_override(
  exclude = "*",
)

I'm very ambivalent about this: I don't think it's a huge amount of work to update RegisteredToolchainsFunction to handle it, but plumbing the new data through will be tricky (check the implementation of register_toolchains for a comparison).

But it's very inelegant: it's much better to simply not depend on something you don't intend to use. If @bazel_tools current depends on @rules_java we're better off trying to break the dependency than just hiding it and papering over.

katre commented 1 month ago

Relatedly, are we shipping any source tools as java_binary targets?

meteorcloudy commented 1 month ago

IIRC, the fundamental problem is that repos defining those toolchains are not fetched lazily because we'll need to fetch the repo to know which toolchain type it is. If this problem is hard to solve in Bazel, then letting users decide which toolchains to be excluded would help avoid unnecessary external dependencies.

meteorcloudy commented 1 month ago

Relatedly, are we shipping any source tools as java_binary targets?

Probably not? /cc @hvadehra

fmeum commented 1 month ago

I don't think there is much of a point in artifically cutting the dependency of a trivial C++ hello world on the rules_java repo. When the Java rules have been moved out of Bazel, this dependency will naturally be gone, but any non-trivial Bazel repo will depend on rules_java anyway.

For example, as soon as you run bazel coverage on a trivial C++ example, a Java runtime will be downloaded to run the coverage merger, at which point the actual rules_java repo is the least of your worries in terms of vendoring :-) Since ruleset repos are tiny and don't contain binaries, I don't think we need to worry about them too much.

Wyverald commented 1 month ago

I agree with Fabian here -- I think the natural solution of "just let Starlarkification run its course" is the best. When all java_* rules are loaded from rules_java, we can simply remove rules_java from bazel_tools, which would get rid of the toolchain registration for projects that don't need it. Trying to work around it earlier will likely just add to our tech debt.

Just as an aside: in case this wasn't clear before, we'll likely never be able to completely empty out bazel_tools. Stuff like the built-in repo rules (http_archive and co.), the shell runfiles library, and the default host platform (@bazel_tools//tools:host_platform) will always need to remain bundled with Bazel, even after we've completed Starlarkification.

lberki commented 1 month ago

The point of putting in the legwork to make sure that a C++ hello world doesn't depend on Java is that then we know we don't have unexpected dependencies and that it's existence proof that it's possible to run Bazel builds that don't depend on @rules_java / @rules_python.

I'm not too worried about that one unexpected dependency edge, but I am worried about the fact that vendoring @rules_java is now pretty much a requirement for builds without Internet access. Firstly, it's a design smell, secondly, I could imagine some trouble brewing if @rules_java suddenly decides have a mandatory dependncy on some large Java tool (or a JDK, etc.)

I do realize that @bazel_tools will never be completely empty; I wanted to express myself concisely instead of precisely. I'm sure we all agree, though, that right now it has way more things in it than it should have.

fmeum commented 1 month ago

We do already have tests that verify that cc_test doesn't depend on certain unexpected tools: https://cs.opensource.google/bazel/bazel/+/0dbfaccaf5bee5ea7f11c01db1fc0cd1ca7f3810:src/test/shell/bazel/cc_integration_test.sh;l=1683. We could extend this to ensure that there are no unexpected rules_java deps without getting rid of the toolchain registration edge.

Doing that via a bazel vendor test would be both more effort and less granular (repo vs target level). If we also add a test to rules_java that detects undesired edges to other repos, we could even catch any accidentally eager fetches of toolchains early.

meteorcloudy commented 1 month ago

I think the problem we want to solve is this: For users who want to vendor external dependencies to achieve offline build, they don't have to vendor repos they don't need just because of toolchain registrations.

This is important for projects like Kleaf which has a strict policy on vendoring 3rd party dependency.

And it's still a problem after we starlarkifying rules_java and move it out of bazel_tools. For example, if kleaf depends on rules_foo, and rules_foo depends on rules_java because a binary tool needs it to provide some optional feature. However, Kleaf knows it doesn't want the optional feature and doesn't need Java toolchains anywhere else. Currently, there is no nice way to avoid vendoring rules_java and potentially other repos introudced during toolchain registration (e.g. bazel_features).

I think the solution might be to offer a way for the root module to exclude toolchain registrations from given modules. Something like

toolchain_exclude(
   module_names = ["rules_java"],
   # maybe also: exclude_all = True,
)

We do already have tests that verify that cc_test doesn't depend on certain unexpected tools: https://cs.opensource.google/bazel/bazel/+/0dbfaccaf5bee5ea7f11c01db1fc0cd1ca7f3810:src/test/shell/bazel/cc_integration_test.sh;l=1683.

This test doesn't guard what repos need to be fetched for building the target. For two reasons: 1) toolchain registration 2) irrelevant load statements. That's why we had to traverse the skyframe graph to discover all repos to vendor for given targets. Therefore, I think bazel vendor is the only way to discover full repo dependencies for targets so far.

meteorcloudy commented 1 month ago

we'll likely never be able to completely empty out bazel_tools

I think the actual goal is to just make MODULE.tools empty instead of making the whole bazel_tools empty? So far, I only see bazel_dep(name = "buildozer", version = "7.1.2") to be a problem, but it has little negative impact.

lberki commented 1 month ago

Yeah, I'm supportive of the idea of being able to say "to the best of knowledge, my builds don't need Java and please error out if it touches anything that touches Java".

Practically speaking, I could imagine a knob in MODULE.bazel that says "depend on these builtin language families" and if the user doesn't say anything, they get all of them. This would provide a convenient way to cut off rule implementations from Bazel if we eventually switch the default to "nothing".

I haven't thought the mechanics through, though. Maybe @comius knows with certainty whether this is a good idea?

fmeum commented 1 month ago

Yeah, I'm supportive of the idea of being able to say "to the best of knowledge, my builds don't need Java and please error out if it touches anything that touches Java".

That should already be possible today: Add third_party/rules_java_stub/MODULE.bazel with content module(name = "rules_java") to your repo and then either add --override_module=%workspace%/third_party/rules_java_stub to .bazelrc or add a local_path_override to MODULE.bazel. If bazel vendor doesn't skip overriden repos today, we could make it do so. Since the module file is empty, it won't register any toolchains and since there are no other files in the module, any references to @rules_java that were missed will result in a "thing not found" error.

meteorcloudy commented 1 month ago

@fmeum Ah, that's a good point and it's basically what I suggested at https://github.com/bazelbuild/bazel/issues/22684#issuecomment-2160961838. I can confirm local_path_override repo is already excluded from vendor.

Then toolchain_exclude may not be that necessary. I'll suggest the Kleaf team to try out this first.

jacky8hyf commented 1 month ago

This could work if we always use --override_module in the command-line.

But unfortunately we can't put it as a default in .bazelrc or in MODULE.bazel local_path_override. This is because we also have stardoc() targets, contained within a package. When building these targets, we allow Internet access because stardoc pulls a bunch of maven dependencies, and it is a big hassle to pull them to the Android tree. So we only build these targets on developer's machines and regularly check-in the generated markdown files to the source tree. We do not build these stardoc() targets or even visit that package when in a contained environment (CI build machines without Internet access).

So, if we always override rules_java with a fake module, is there a way for us to offset the effect of --override_module when building stardoc? For example, can we make --override_module=rules_java= clear any previous values of --override_module for rules_java set in .bazelrc or in MODULE.bazel?

(on behalf of the Kleaf team)