`bazel aquery` should expand directories

dgp1130 commented 3 years ago

Description of the problem / feature request:

bazel aquery should expand directories.

Feature requests: what underlying problem are you trying to solve with this feature?

I was trying to leverage the expand directories feature of ctx.actions.arguments().add_all(), but was encountering bugs. I used bazel aquery to inspect the arguments and was confused when the directory was not expanded. I spent a solid hour debugging this before I eventually discovered that bazel aquery was lying to me. The directory was being expanded, this was just wasn't being shown in bazel aquery.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

We define two rules, simple_dir(), which generates a directory of input files, and expand_directories(), which uses the directory expansion feature in its command line execution.

# build_defs.bzl

def _simple_dir_impl(ctx):
    dir = ctx.actions.declare_directory(ctx.attr.name)

    ctx.actions.run_shell(
        command = """
            for file in "$@"; do
                cp ${{file}} {output_dir}/$(basename ${{file}})
            done
        """.format(
            output_dir = dir.path,
        ),
        arguments = [file.path for file in ctx.files.files],
        inputs = ctx.files.files,
        outputs = [dir],
    )

    return DefaultInfo(files = depset([dir]))

simple_dir = rule(
    implementation = _simple_dir_impl,
    attrs = {
        "files": attr.label_list(allow_files = True),
    },
)

def _expand_directories_impl(ctx):
    output = ctx.actions.declare_file("%s.txt" % ctx.attr.name)

    args = ctx.actions.args()
    args.add_all([ctx.file.dir])

    # Output all arguments to the generated file to confirm execution.
    ctx.actions.run_shell(
        command = """
            echo $@ > {output}
        """.format(output = output.path),
        arguments = [args],
        inputs = [ctx.file.dir],
        outputs = [output],
    )

    return DefaultInfo(files = depset([output]))

expand_directories = rule(
    implementation = _expand_directories_impl,
    attrs = {
        "dir": attr.label(
            mandatory = True,
            allow_single_file = True,
        ),
    },
)

Now we use them:

# BUILD.bazel

load(":build_defs.bzl", "expand_directories", "simple_dir")

# Generate a directory of two files.
simple_dir(
    name = "dir",
    files = [
        "entry1.txt",
        "entry2.txt",
    ],
)

# Execute a command with the directory expanded.
expand_directories(
    name = "expansion",
    dir = ":dir",
)

Now let's query the action:

$ bazel aquery //:expansion
action 'Action expansion.txt'
  # ...
  Command Line: (exec /bin/bash \
    -c \
    '
            echo $@ > bazel-out/k8-fastbuild/bin/expansion.txt
        ' \
    '' \
    bazel-out/k8-fastbuild/bin/dir)

Note that only one command line argument is listed: bazel-out/k8-fastbuild/bin/dir.

But what if we execute the build:

$ bazel build //:expansion && cat bazel-out/k8-fastbuild/bin/expansion.txt
# ...
bazel-out/k8-fastbuild/bin/dir/entry1.txt bazel-out/k8-fastbuild/bin/dir/entry2.txt

Both files were passed as individual arguments! bazel aquery should have shown both of these files in its command line, or at least indicated that the directory is expanded: <expanded directory bazel-out/k8-fastbuild/bin/dir>.

Note that if I add exit 1 to the end of the expand_directories() command and run with --verbose_failures, I do get a more honest logged command:

(cd /home/dparker/.cache/bazel/_bazel_dparker/bffdf2c425f6a35c1ed4af0a0628b396/sandbox/linux-sandbox/253/execroot/build_bazel_rules_nodejs && \
  exec env - \
    TMPDIR=/tmp \
  /home/dparker/.cache/bazel/_bazel_dparker/install/4c21e876de560d3ed6010a2e5e2f9074/linux-sandbox -t 15 -w /home/dparker/.cache/bazel/_bazel_dparker/bffdf2c425f6a35c1ed4af0a0628b396/sandbox/linux-sandbox/253/execroot/build_bazel_rules_nodejs -w /tmp -w /run/shm -D -- /bin/bash -c '
            echo $@ > bazel-out/k8-fastbuild/bin/expansion.txt
            exit 1
        ' '' bazel-out/k8-fastbuild/bin/dir/entry1.js bazel-out/k8-fastbuild/bin/dir/entry2.js)

What operating system are you running Bazel on?

Ubuntu 20.04 via WSL2 on Windows 10.

What's the output of `bazel info release`?

$ bazel info release
INFO: Invocation ID: 649ddb3b-8152-4aab-9e60-1f9fd84c6571
release 3.6.0

Have you found anything relevant by searching the web?

Google searches for "bazel actions add_all does not expand directories" didn't come up with much. Searches for "bazel expand directories aquery" also did not yield much.

github-actions[bot] commented 1 year ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

dgp1130 commented 1 year ago

I'm definitely still interested in seeing this fixed. To me, it seems like a bug in aquery, since it's printing an objectively wrong command.

tjgq commented 1 year ago

It's a little tricky to fix this; aquery by design only runs the analysis phase, but the contents of tree artifacts are only known after the execution phase (more to the point: we'd have to actually run the action that produces the tree artifact, which in turn would require us to run the actions it depends on). The --subcommands flag might be more helpful in this case.

I wonder if we should more clearly mark the directories as a lie in the aquery output.

github-actions[bot] commented 4 months ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.

dgp1130 commented 3 months ago

Limiting bazel aquery to the analysis phase seems reasonable. Can I suggest a slight alternative of generating an executable command which dynamically expands the directory? Something like:

$ bazel aquery //:expansion
action 'Action expansion.txt'
  # ...
  Command Line: (exec /bin/bash \
    -c \
    '
            echo $@ > bazel-out/k8-fastbuild/bin/expansion.txt
        ' \
    '' \
    $(bazel-out/tools/expand-tree bazel-out/k8-fastbuild/bin/dir))

Here I added $(bazel-out/tools/expand-tree ${dir}) which would list all the files in the directory and print their full execpaths so this command expands to:

$ bazel aquery //:expansion
action 'Action expansion.txt'
  # ...
  Command Line: (exec /bin/bash \
    -c \
    '
            echo $@ > bazel-out/k8-fastbuild/bin/expansion.txt
        ' \
    '' \
    bazel-out/k8-fastbuild/bin/dir/entry1.js bazel-out/k8-fastbuild/bin/dir/entry2.js)

This would allow us to stay constrained to the analysis phase, but also copy-pasting the command line will run the command identically to the build and align with --subcommands, even if it needs some indirection to get there.

The two challenges I can think of are:

We need to make sure expand-tree lists the files in the correct order. I assume the order already needs to be stable so this is probably doable.
We need to make sure expand-tree is available and built. It's not normally a dependency of this target, so even if //tools/expand-tree somehow exists in the build graph, it won't be built. I'm not sure if @bazel_tools can help here or exactly how that fits in. Maybe we can ship this binary in there somehow?
On Windows hosts we'd need to rewrite this command to work with cmd / PowerShell / whatever, which might be tricky to make fully xplat compatible. My example already has exec /bin/bash, so I feel like Bazel must already be doing that here to some extent?

tjgq commented 2 months ago

Can I suggest a slight alternative of generating an executable command which dynamically expands the directory?

If you mean a command you can run in substitution of an actual build, I don't think this will work. In the general case there's an entire action (sub-)graph that needs to run to get to the point where the action that produces the directory can itself run, i.e., expand-tree is basically equivalent to bazel build.

If you mean a command you can run after the build, it's literally just a script to recursively list the directory contents; I don't see much value in Bazel providing it.

dgp1130 commented 2 months ago

I was mainly looking to debug the command being executed. I agree building the directory is probably out of scope and equivalent to bazel build-ing it. My goal here is:

Ensure bazel aquery doesn't mislead users. I argue the current behavior is actively lying to users about it's arguments, which is worse than just not supporting tree artifacts.
Allow users to reproduce the command outside of a Bazel context.

You're right that my proposed expand-tree doesn't do much useful, though I don't think it strictly needs to. Maybe Bazel just prints a find command. As long as the output is the same was what Bazel will provide during execution (file order might be important), that's fine.

Even if this isn't worth an extra expand-tree tool and we can't trivially do the same thing with find, we could at least stop misleading users by changing the syntax to be clear it isn't directly executable. Something like:

$ bazel aquery //:expansion
action 'Action expansion.txt'
  # ...
  Command Line: (exec /bin/bash \
    -c \
    '
            echo $@ > bazel-out/k8-fastbuild/bin/expansion.txt
        ' \
    '' \
    BAZEL_EXPANDED_TREE_OF_bazel-out/k8-fastbuild/bin/dir)

Unfortunately it's tricky to communicate that this isn't executable code and what the equivalent should be, which is why I think an expand-tree or find subcommand would be preferred.

tjgq commented 2 months ago

On second thought, I think I've misrepresented the complexity of expand-tree above. In the general case, it's not simply a matter of listing the directory contents, because you can write relatively arbitrary Starlark code to transform them. For example:

def _mapper(f):
  if f.path.endswith(".txt"):
    # Omit from command line
    return None 
  else:
    # Convert to uppercase and prefix with flag name
    return "--foo={}".format(f.path.toupper())
...
d = ctx.actions.declare_directory("dir")
...
args = ctx.action.args()
args.add_all([d], map_each = _mapper)

I'm not claiming that this particular example is realistic, just making a point that expand-tree would need to contain a Starlark interpreter to be generally applicable.

tjgq commented 2 months ago

I think it makes sense to communicate to users that the command line is unreliable, either by mangling it into something that clearly won't work (such as a BAZEL_EXPANDED_TREE_OF_ prefix), or by adding an additional is_command_line_reliable field to the aquery output. I'm less sure that we should implement expand-tree; it feels like an awful lot of work for something that requires you to actually run the build to be useful (at which point you might as well use -s).

dgp1130 commented 2 months ago

I don't remember trying -s, if that prints the correct output, then I agree it might be better to point users at that. I can see what you mean that since this is dependent on Starlark the mapping information is lost and needs to reanalyze the build. I suspect a Starlark interpreter in expand-tree wouldn't be enough? You'd need to bazel build from scratch to get the right inputs.

Edit: Actually map_each doesn't support closures right? So maybe you could get away with just parsing the file. Is there any way that could depend on content outside the file? Maybe calling a load(...) statement loaded with --package_path? An extreme example for sure, but I suspect using an interpreter directly might come with it's own limitations.

The one alternative is to implement something simple like find for cases where map_each is not used on the assumption that this is most cases and at least provides partial support. My intuition is that most tree artifact-consuming actions don't use that particular feature, but I don't have any data to back that up. If so, we could at least give partial support for the simple case.

is_command_line_reliable feels too easy to overlook IMHO. I recommend having something which is invalid Bash in the display command to make clear that it's not a real executable statement and cause an error for users who attempt to run it.

tjgq commented 2 months ago

I don't remember trying -s, if that prints the correct output, then I agree it might be better to point users at that.

Yes, -s will print a fully working command line, with the caveat that it will use the execroot as the working directory, which is non-hermetic; the actual action might run inside a sandbox, which -s doesn't recreate (and neither does aquery).

I can see what you mean that since this is dependent on Starlark the mapping information is lost and needs to reanalyze the build. I suspect a Starlark interpreter in expand-tree wouldn't be enough? You'd need to bazel build from scratch to get the right inputs.

You always need to run bazel build to establish the precondition that the action inputs exist and are up to date. The question is whether you then run the command line indicated by -s or the one provided by aquery. The only thing expand-tree could reasonably do (since running a build is unreasonable) is run the Starlark expansion logic on the files that are present in the directory to produce a command line equivalent to -s.

dgp1130 commented 2 months ago

I meant "to get the right inputs [for the map_each]". What happens when:

load("//path/to/pkg:defs.bzl", "map_file")

def _mapper(file):
    return map_file(file)

def _some_rule_impl(ctx):
    dir = ctx.actions.declare_directory(ctx.attr.name)

    args = ctx.actions.args()
    args.add_all(dir, map_each = _mapper)

    # ...

Then if you build with something like --package_path which contains path/to/pkg:defs.bzl:

$ bazel build --package_path /some/other/path //foo

So to be correct, I would think expand-tree would also need that --package_path arg and makes me think that even evaluating the Starlark isn't that far off from a full analysis from bazel build. Not sure if other options might affect it too.

Regardless, I agree implementing expand-tree to be fully correct for the map_each case is likely infeasible. I think the options here are:

Update bazel aquery command line to be clear that the result is not a truly equivalent command line execution.
Update bazel aquery command line to print a correct version with find or expand-tree for "simple" cases (no map_each).
~Update bazel aquery with expand-tree for map_each~ - INFEASIBLE.
Do nothing and accept the inconsistency.

I feel like 1. is probably easy enough to be worth doing, it's mainly a question of what the right syntax is to best communicate this limitation. 2. is potentially worth it if we can confirm that most args.add_all invocations on TreeArtifact objects do not use map_each, not sure if we have any good data or intuition on that point (I suspect google3 metrics might be skewed since TreeArtifact was relatively recently added)?

bazelbuild / bazel