bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
23.23k stars 4.07k forks source link

Allow per-target Java runtime selection for java_binary/java_test #22866

Open timothyg-stripe opened 4 months ago

timothyg-stripe commented 4 months ago

Description of the feature request:

java_binary and java_test currently use toolchain resolution to discover the Java runtime toolchain to use, based on the value of --java_runtime_version= (and --tool_java_runtime_version=) command-line options (aside from platform matching). However, toolchain resolution is rather cumbersome if we want to use different Java runtime toolchains for different targets.

Going further, we often want two distinct java_binary targets to share many of the same java_library dependencies. Since java_runtime_version is part of the configuration, naïvely setting different java_runtime_versions on the two java_binarys with a transition will often result in shared java_librarys being compiled twice, in two different configurations' output directories.

It would be great if users are allowed to optionally specify which Java runtime to use through an attribute, tentatively named java_runtime, on java_binary/java_test. If this attribute is not specified, those rules can fall back to the current toolchain resolution logic.

Note: This feature request does not concern compile-time Java versions.

Which category does this issue belong to?

Configurability, Java Rules

What underlying problem are you trying to solve with this feature?

We have a large heterogeneous JVM monorepo that support both backend services and data jobs. Some targets (e.g., backend services) would like to run on a newer Java runtime, while the Java runtime version for data jobs are often constrained by the open-source framework we use. (E.g., Apache Flink does not yet support JDK 21.)

To our knowledge, there are three existing ways to change runtime JDK on a per-target basis. But they all leave something to be desired.

  1. Use an incoming edge transition that changes java_runtime_version for the different binaries.
    1. Two downsides of this approach are: (a) java_binary, being a built-in rule, does not allow setting incoming-edge transitions; (b) if we bazel build two binaries with different runtime JDKs at the same time, libraries shared by binaries will get compiled twice.
    2. To resolve (b), we could additionally use an outgoing edge transition on deps to revert the java_runtime_version back to the "default" version. But that brings more complexity, and doesn't solve all the problems. (E.g., Bazel's output directory computation uses the top-level target's configuration as the "default" configuration. If this "default configuration" can change for different targets, then developers will see spurious rebuilds. This turns out to be false – I stand corrected.)
  2. Use the launcher attribute of the java_binary to call a JDK other than the one Bazel thinks it should use. This requires us to write some C++ code (since launcher must be a cc_binary), doesn't allow sharing action cache keys between different architectures, and overall feels like a hack.
  3. Switch out the _stub_template private attribute on java_binary to use a script that looks for another JDK to call into. This relies on Bazel implementation details, and is even more of a hack.

Which operating system are you running Bazel on?

macOS

What is the output of bazel info release?

7.1.2

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

There are two instances of prior art:

Any other information, logs, or outputs that you want to share?

No response

timothyg-stripe commented 4 months ago

cc @cushon

timothyg-stripe commented 4 months ago

See #22881 for a sample implementation of this.

hvadehra commented 4 months ago

naïvely setting different java_runtime_versions on the two java_binarys with a transition will often result in shared java_librarys being compiled twice, in two different configurations' output directories.

Is this still true? Analysis will be duplicated, but I would have thought double compilation could be avoided with --experimental_output_paths (https://github.com/bazelbuild/bazel/discussions/22658)?

timothyg-stripe commented 4 months ago

Sure, yeah, output path mapping will solve the double compilation. But the feature doesn't appear to be ready for widespread adoption yet, and in any case it would take up double the disk space for output base…

cushon commented 4 months ago

I think the transition approach is more robust. Most of the shared dependencies are going to be the same, but I have seen examples of things like library code that bake in a path to the configured Java runtime, which would break if the runtime was swapped out at the top-level java_binary target.

timothyg-stripe commented 4 months ago

That's not currently possible because java_binary is a built-in rule, and we can't attach a transition onto a built-in rule… (Also, java_binary is a macro in Bazel, so it doesn't work even if I were to enable the experimental rule extension API…)

I have seen examples of things like library code that bake in a path to the configured Java runtime

We're not concerned about this because we own all the code in the repo.

timothyg-stripe commented 4 months ago

In any case, in my mind there's little downside for adding an attribute. The change is quite simple and self-contained, and is only opt-in.

cushon commented 4 months ago

I think the main downside in adding the attribute would be that it would be available to all users of java_binary and java_test, but it wouldn't work correctly for repos that reference the configured java runtime outside top-level binary and test targets.