bazelbuild / stardoc

Stardoc: Starlark Documentation Generator
Apache License 2.0
103 stars 40 forks source link

Unbreak distro tarball after #236 #241

Open tetromino opened 1 week ago

tetromino commented 1 week ago

The new protoc toolchain must be available to users of the distro version of our MODULE.bazel and setup.bzl / deps.bzl for WORKSPACE users.

Add a distro-manual-test.sh script to make distro tarball generation easier to test for maintainers.

Fixes #240

fmeum commented 1 week ago

I do see this as potentially problematic: toolchains should ultimately be the responsibility of the end user (or dedicated toolchains modules such as toolchains_protoc or perhaps rules_java). I'm worried that this could lead to confusing "spooky action at a distance" where a stardoc version bump ends up breaking your proto build that happened to rely on it.

I could see a few alternatives:

  1. Since stardoc is a tool, not a library, register toolchains with special, stardoc-specific constraints that prevent them from leaking into the rest of the build.
  2. Have rules_java or toolchains_protoc define a default proto_lang_toolchain that can only be used in the exec configuration and that other rulesets can rely on.
  3. Document the manual steps end users need to take.

Cc @alexeagle @comius

lberki commented 1 week ago

@katre @Wyverald WDYT about the "registering toolchains for the users" pattern?

My first, naive thought is that the approach in this pull request is better than the alternative, which is to expose the "implementation detail" that protoc is required for Stardoc. But that's just that, my first, naive thought so don't take it as gospel. @katre and @Wyverald have thought about this way more than I did.

tetromino commented 1 week ago

@alexeagle and @fmeum - thanks for the review! Two questions:

fmeum commented 1 week ago

For WORKSPACE, moving registrations into a separate macro (register_stardoc_proto_toolchains) seems reasonable to me.

The register_toolchains call in MODULE.bazel is the particularly problematic part as users won't even get to see it.

We discussed this in the Rules Authors SIG meeting and had the following ideas:

  1. Pregenerate the Java protobuf files and check them into the repo, with a test to verify that they are up-to-date. This is similar to how non-Bazel Go modules handle proto code generation and would avoid any proto actions at build time.
  2. Don't register the toolchain, but instead transition --extra_toolchains to include it only for the relevant Stardoc targets. Since Stardoc is a tool and doesn't end up on any runtime classpath, this would make it work out of the box without forcing the toolchain registration on downstream users.
tetromino commented 1 week ago

@fmeum - I am intrigued by the --extra_toolchains approach, but there are complications.

We need the --extra_toolchains on @stardoc//stardoc/proto:stardoc_output_java_proto. This means not just when the java_proto is used transitively from the stardoc rule, but also from the java targets and tests under @stardoc//src/...

I'm wondering if the right approach is to wrap stardoc_output_java_proto in a rule that applies the transition - or maybe even fork java_proto_library?..

fmeum commented 1 week ago

That sounds like one viable approach to me! It also avoids potential breakages through breaking changes in future protoc versions, at the cost of possibly downloading one more proto runtime jar.

tetromino commented 1 week ago

@fmeum - I have talked to @Wyverald offline today. He recommends leaving register_toolchains unconditionally in MODULE.bazel without protecting it by transitions or constraints - because it is the way toolchain registration is intended to be used. Toolchain resolution gives higher priority to toolchains registered closer to the root module in the dependency graph, so a dependency on stardoc will not cause a module's own preferred java protoc toolchain to be overridden. See https://bazel.build/extending/toolchains#toolchain-resolution.

For WORKSPACE users, we do need a toolchain registration macro, since there the toolchain hierarchy is flat.

fmeum commented 1 week ago

It's true that the BFS order of toolchain registrations with Bzlmod results in reasonable behavior if you consider just stardoc and the root module.

But what if another module totally_not_stardoc also starts to register proto toolchains for Java with a different (perhaps even incompatible) version of protoc and proto runtimes? If both are direct dependencies of the root module (which doesn't directly use protos), the first bazel_dep will win and we are back to the same kind of problem that plagued WORKSPACE: totally_not_stardoc may fail to build if and only if it comes last in BFS order due to stardoc registering a proto toolchain that misses a new feature relied upon by totally_not_stardoc.

It even gets more unpleasant for proto: The version of protoc needs to align with the version of the Java runtime, but both are represented by different toolchain types. If one is subject to MVS via a dep on toolchains_protoc and the other is "first in BFS order wins", this can lead to similar issues where the toolchains registered first aren't compatible.

To avoid having problematic WORKSPACE-style semantics reemerge embedded into proto toolchain registrations, we should look at the case of multiple modules relying on it more closely.