google / bazel-common

Common functionality for Google's open-source libraries that are built with bazel.
Apache License 2.0

More than one javadoc_library rule ends up producing "merged" output #153

Closed (niloc132 closed 2 years ago)

niloc132 commented 2 years ago

If a project has more than one javadoc_library rule and --spawn_strategy=local is set (for example in the .bazelrc file), the javadoc jar produced by one rule may end up containing some of the other rule's output.

As long as the rule invocations run in series, the HTML results show no obvious difference: index.html etc. will only point to the expected classes. In theory, running the rules in parallel could produce more visible bugs, but I haven't seen this occur. The bug does become obvious, though, when inspecting the contents of the javadoc jar.

Example BUILD file:

    javadoc_library(
        name = "foo",
        srcs = ["Foo.java"],
    )

    javadoc_library(
        name = "bar",
        srcs = ["Bar.java"],
    )

Here a trivial Java file exists for each of the two classes. You would expect each jar to contain only the class it references, but instead one will contain both. This happens even if the rules are declared in different BUILD files in different directories, although separate invocations of bazel do not seem to share contents.
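One way to confirm the symptom is to look inside each javadoc jar for a page that belongs to the other target. The following is a minimal Python sketch, not part of bazel-common: the helper name `has_class_page` and the jar paths in the usage note are assumptions made for illustration, and it relies only on the fact that javadoc writes one `<ClassName>.html` page per documented class.

```python
import posixpath
import zipfile


def has_class_page(jar_path, class_name):
    """Return True if the jar holds a javadoc page named <class_name>.html.

    Only the base name of each entry is compared, so the page is found
    regardless of which package directory it sits in.
    """
    wanted = class_name + ".html"
    with zipfile.ZipFile(jar_path) as jar:
        return any(posixpath.basename(name) == wanted for name in jar.namelist())
```

With the BUILD file above, one would expect something like `has_class_page("bazel-bin/foo.jar", "Bar")` to be False (the path is hypothetical and depends on the package). If it returns True for either jar, that jar has picked up the other target's output.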

I have also seen that if the tmp/ directory already exists in the root directory of the project, its contents can be merged into the output javadoc jar, but I can't consistently reproduce that.


I believe this is a bug in the javadoc_library rule: it does not correctly declare the output it creates, and so is not as "hermetic" as it should be. Presently, all output is written to a tmp/ directory, which happens to be in the root of the project (regardless of where bazel is invoked from, or where the rule itself is declared):

https://github.com/google/bazel-common/blob/bf8e5ef95b118d1716b0cb4982cf15b6ed1c896f/tools/javadoc/javadoc.bzl#L42
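To illustrate why a shared scratch directory would produce this kind of merged jar, here is a small Python simulation. It does not run bazel or javadoc: `fake_javadoc`, `fake_jar`, `build` and `demo` are hypothetical stand-ins, and the sketch assumes (as the behaviour above suggests) that with the local strategy the tmp/ directory survives from one action to the next.

```python
import os
import tempfile
import zipfile


def fake_javadoc(out_dir, class_name):
    """Stand-in for javadoc: write one HTML page for class_name into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, class_name + ".html"), "w") as page:
        page.write("<html>%s</html>" % class_name)


def fake_jar(jar_path, src_dir):
    """Stand-in for `jar cf <jar> -C <dir> .`: archive every file in src_dir."""
    with zipfile.ZipFile(jar_path, "w") as jar:
        for name in sorted(os.listdir(src_dir)):
            jar.write(os.path.join(src_dir, name), arcname=name)


def build(workspace, targets, shared_tmp):
    """Run the fake javadoc and jar steps for each target, in series.

    Returns a dict mapping each target name to the sorted entries of its jar.
    """
    contents = {}
    for target, class_name in targets:
        if shared_tmp:
            # Every target writes to the same tmp/ directory.
            out_dir = os.path.join(workspace, "tmp")
        else:
            # Each target gets its own output directory.
            out_dir = os.path.join(workspace, target + "_javadoc")
        fake_javadoc(out_dir, class_name)
        jar_path = os.path.join(workspace, target + ".jar")
        fake_jar(jar_path, out_dir)
        with zipfile.ZipFile(jar_path) as jar:
            contents[target] = sorted(jar.namelist())
    return contents


def demo():
    """Build the two targets once with a shared tmp/ and once without."""
    targets = [("foo", "Foo"), ("bar", "Bar")]
    with tempfile.TemporaryDirectory() as workspace:
        shared = build(workspace, targets, shared_tmp=True)
    with tempfile.TemporaryDirectory() as workspace:
        isolated = build(workspace, targets, shared_tmp=False)
    return shared, isolated
```

In the shared case the second target's jar also archives the page left behind by the first, which matches the "one will contain both" symptom; with per-target directories each jar holds only its own page.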

Instead, this directory should probably be in the same directory as the output. My understanding of bazel is that this idea could look something like:

    tmp = ctx.actions.declare_directory("%s_javadoc" % ctx.attr.name)

    javadoc_command = [
        java_home + "/bin/javadoc",
        "-use",
        "-encoding UTF8",
        "-classpath",
        ":".join([jar.path for jar in classpath]),
        "-notimestamp",
        "-d %s" % tmp.path,  # point javadoc at the newly declared directory
        "-Xdoclint:-missing",
        "-quiet",
    ]

In turn, the later jar command would need to consume this new directory:

    jar_command = "%s/bin/jar cf %s -C %s ." % (java_home, ctx.outputs.jar.path, tmp.path) 

and the action would need to declare this directory as an output:

    ctx.actions.run_shell(
        inputs = srcs + classpath + ctx.files._jdk,
        command = "%s && %s" % (" ".join(javadoc_command), jar_command),
        outputs = [ctx.outputs.jar, tmp], # new output declared here
    )