bazelbuild / stardoc

Stardoc: Starlark Documentation Generator
Apache License 2.0
108 stars 45 forks source link

Expose stardoc() output files as runfiles #139

Closed philsc closed 2 years ago

philsc commented 2 years ago

There's currently a nuisance with the stardoc() rule that presents itself in rules_python:

$ git clone https://github.com/bazelbuild/rules_python.git
$ cd rules_python
$ git checkout d314e96aaab18f60df50400d61214f7c1d71b8e6
$ bazel run //docs:update
cp: cannot stat 'bazel-bin/docs/packaging.md_': No such file or directory
cp: cannot stat 'bazel-bin/docs/pip.md_': No such file or directory
cp: cannot stat 'bazel-bin/docs/pip_repository.md_': No such file or directory
cp: cannot stat 'bazel-bin/docs/python.md_': No such file or directory

A sample of the targets involved look like so:

$ bazel cquery --output=build //docs:update + //docs:packaging-docs
INFO: Invocation ID: 5fd7a652-0b0d-4827-98f5-c345b38b2178
INFO: Analyzed 2 targets (0 packages loaded, 0 targets configured).
INFO: Found 2 targets...
# /home/jenkins/repos/rules_python/docs/BUILD:153:10
sh_binary(
  name = "update",
  target_compatible_with = [],
  data = ["//docs:packaging-docs", "//docs:pip-docs", "//docs:pip-repository", "//docs:core-docs"],
  srcs = ["//docs:update.sh"],
)
# Rule update instantiated at (most recent call last):
#   /home/jenkins/repos/rules_python/docs/BUILD:153:10 in <toplevel>

# /home/jenkins/repos/rules_python/docs/BUILD:121:8
stardoc(
  name = "packaging-docs",
  target_compatible_with = [],
  input = "//python:packaging.bzl",
  deps = ["//docs:packaging_bzl"],
  out = "//docs:packaging.md_",
)
# Rule packaging-docs instantiated at (most recent call last):
#   /home/jenkins/repos/rules_python/docs/BUILD:121:8 in <toplevel>
# Rule stardoc defined at (most recent call last):
#   /bazel-cache/phil/bazel/_bazel_phil/adc54b5b09500e464f8a73095f3bd8e3/external/io_bazel_stardoc/stardoc/stardoc.bzl:110:15 in <toplevel>

The update target could instead reference the *.md_ files directly instead of referencing the stardoc() targets. But it's not obvious that this is the desired work flow. It feels like users should be able to depend on the stardoc() target instead of its predeclared output.

This patch fixes this by adding the predeclared outputs to the target's runfiles. That lets the rules_python doc update target work again.

$ bazel run //docs:update
'bazel-bin/docs/packaging.md_' -> 'docs/packaging.md'
'bazel-bin/docs/pip.md_' -> 'docs/pip.md'
'bazel-bin/docs/pip_repository.md_' -> 'docs/pip_repository.md'
'bazel-bin/docs/python.md_' -> 'docs/python.md'
philsc commented 2 years ago

Pinging code owners @brandjon and @tetromino

brandjon commented 2 years ago

Hm. According to the documentation for data,

The default outputs and runfiles of targets in the data attribute should appear in the *.runfiles area of any executable which is output by or has a runtime dependency on this target. [emphasis mine]

So you'd think since the stardoc target is used in a data attr, adding the output md to runfiles shouldn't matter when it's already in default outputs.

Then again, the update script is not accessing these files from the runfiles tree, it's looking for the build outputs in bazel-bin/. Still, when I look at the runfiles manifest of the update script, it only includes the update and update.sh executables, not the default outputs of the stardoc targets.

I do notice that the md files are absent from bazel-bin/ until I request them explicitly in a bazel invocation.

So it sounds like contrary to the documentation, sh_binary is not populating the runfiles tree with default outputs of its targets in data, and that this also causes them to not be built. This thwarts both the traditional way to access data files (runfiles), and the bazel-bin/ mechanism that the updater script is using.

Doing a quick search for open issues involving sh_binary and data, I see that this is indeed the case: bazelbuild/bazel#15043, with pending pull bazelbuild/bazel#15052.

I see no issue with working around this by having stardoc add the output to runfiles too.

philsc commented 2 years ago

You know, I've been writing explicit DefaultInfo instances for so long that I forgot that default outputs get captured automatically (or at least should be).

Really appreciate the review. I didn't realize that this is actually a bug in bazel. Thank you!