bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
23.2k stars 4.06k forks source link

Allow unresolved symlinks inside a tree artifact #15454

Open GWLeal opened 2 years ago

GWLeal commented 2 years ago

Hello,

I've noticed some strange behavior regarding how bazel evaluates invalid symlinks.

UC1: When I declare an invalid symlink, in a custom rule, and I build a bazel target of that kind bazel prints out the message: Error in declare_symlink: actions.declare_symlink() is not allowed; use the --experimental_allow_unresolved_symlinks command line option As expected when the suggested option is added the target builds successfully.

UC2: However when I declare a folder and add a invalid symlink in that folder I get: ERROR: (...): Failed to resolve relative path invalid_symlink inside TreeArtifact (...) The associated file is either missing or is an invalid symlink. Adding the --experimental_allow_unresolved_symlinks does not work in this case.

This seems like inconsistent behavior. Also it is pretty common to pack build artifacts that contain invalid symlinks that will only become valid when the package is installed on the intended target. I looked in the documentation and did not find a way to circumvent the issue.

I prepared a minimal working example to replicate the problem: https://github.com/GWLeal/bazel_invalid_symlinks_ex

UC1: bazel build //test:symlink_test

UC2: bazel build //test:dir_symlink_test

I'm currently using bazel version 5.1.1

Thank you in advance for taking the time to look into this :)

lberki commented 2 years ago

I'm not too surprised. --experimental_allow_unresolved_symlinks only works for "regular" artifacts not those that are part of TreeArtifacts. This is one of the reasons why it's still experimental...

/cc @coeuvre @alexjski

@coeuvre maybe this is something that could conceivably be fixed as a drive-by fix while you are poking at metadata handling for remote execution? (I dorealize this is a long shot)

alexjski commented 2 years ago

--experimental_allow_unresolved_symlinks

What is the use case for having unresolved symlinks?

GWLeal commented 2 years ago

--experimental_allow_unresolved_symlinks

What is the use case for having unresolved symlinks?

@alexjski in this case we generate an image with the application filesystem that is mounted in one of the target's partitions as read-only. One of the symlinks points to a file in the OS partition in the target. But I would say that having unresolved symlinks in the context of packaging is common.

(thank you for the replying so quickly)

lberki commented 2 years ago

After some discussion with @lfpino we came to the conclusion that there are two missing pieces before we can stabilize this:

  1. Support on Bazel's RE interface (this is known to not work)
  2. Support in various sandboxes (code suggests that this works, but I don't know off the top of my head if it does or it is tested)

I specifically don't want to add any requirement concerning Windows because symlinks on Windows are a shaky edifice anyway that is not used widely so I don't think it makes sense for Bazel to handle symlinks on Windows better than Windows itself.

So if @lfpino fixes the above two issues, I see no obstacle to stabilizing this functionality (the flag will stay since we need to disable it at Google, but it can default to true and the --experimental_* prefix can be removed)

Since I won't be able to attend to this in the near future, @coeuvre graciously agreed to oversee the process and review the code.

fmeum commented 2 years ago

I have RBE support implemented on a branch, which I can submit for review after https://github.com/bazelbuild/bazel/pull/15700 and https://github.com/bazelbuild/bazel/pull/15773. Specifically, the tests I added in https://github.com/bazelbuild/bazel/pull/15700 seems to indicate that everything works pretty well on Windows (except possibly the Windows sandbox, which I haven't tested with yet).

@lfpino Let me know if you want to sync on this.

lfpino commented 2 years ago

@fmeum I've only started looking at this recently, I'd definitely love to meet and chat about this.

ismell commented 2 years ago

I have RBE support implemented on a branch, which I can submit for review after #15700 and #15773. Specifically, the tests I added in #15700 seems to indicate that everything works pretty well on Windows (except possibly the Windows sandbox, which I haven't tested with yet).

@lfpino Let me know if you want to sync on this.

Looks like the two issues stated have been fixed. What's the status of the RBE branch?

fmeum commented 2 years ago

@ismell --experimental_allow_unresolved_symlinks should just work with RBE with Bazel HEAD. The flag is planned to be flipped for Bazel 6. I don't think that support covers unresolved symlinks in tree artifacts though.

ismell commented 2 years ago

Ah, I was looking for the unresolved symlinks in tree artifacts support. Is that still being worked on or has it been shelved?

fmeum commented 2 years ago

I don't know, but I think @larsrc-google does.

larsrc-google commented 2 years ago

Unresolved symlinks in tree artifacts are not currently being worked on, to the best of my knowledge.

lberki commented 2 years ago

@larsrc-google correct; we had a discussion with @tjgq and the outcome was that it's highly non-trivial how to handle them, because technically, each file in tree artifacts could require different treatment (Bazel could either resolve any symlink or not) and there isn't an obvious way to tell Bazel about each file what to do.

The simplest way to resolve that to make the decision at the level of tree artifact (i.e. either every symlink is resolved or every symlink is unresolved), but making that behavior consistent requires some thought about how it works with remote execution and maybe changes to the protocol which we didn't have time for for Bazel 6.0 so we decided to shelve the problem for now.

larsrc-google commented 2 years ago

I'm adding this to the docs in my CL removing experimental_.

comius commented 1 year ago

I changed the team, because the issue is more concerned with the execution phase than it is with Rules API. Local exec seemed to be the closest pick, although probably not precise one.

tjgq commented 1 year ago

@comius The feature request here is to support unresolved symlinks inside a tree artifact; the current semantics are to transparently resolve symlinks into the files they point to.

I don't think this is purely an "execution phase" change, because it would affect the way artifacts are tracked by Skyframe, and require some sort of API to opt into it: either a flag to change the semantics for every tree artifact, or some way to toggle it on a per-tree or per-file-inside-tree basis. This seems within the purview of Build API (and perhaps even Core, given the Skyframe implications).

lberki commented 1 year ago

I think it's all of the above: one would need to

aherrmann commented 7 months ago

I ran into this issue while updating rules_zig to support the upcoming Zig 0.12.0 release (as of 0.12.0-dev.3366+8e7d9afda). The Zig compiler intentionally generates a dangling symlink to store metadata in its global cache directory. rules_zig currently treats Zig's cache directories as an output, see https://github.com/aherrmann/rules_zig/issues/87.

With Bazel 7.0.2 this behavior raises errors of the form

ERROR: /home/aj/src/rules_zig/main/e2e/workspace/include-dependencies/zig-std-include/BUILD.bazel:5:11: Error while validating output TreeArtifact File:[[<execution_root>]bazel-out/k8-fastbuild/bin]include-dependencies/zig-std-include/.zig-cache/global/binary : Failed to resolve relative path o/8c1a455b9d0322528743f00b10ab8b93/lld.id inside TreeArtifact /home/aj/.cache/bazel/_bazel_aj/0deefcb713c28fbc7261691ef6069b08/execroot/_main/bazel-out/k8-fastbuild/bin/include-dependencies/zig-std-include/.zig-cache/global/binary. The associated file is either missing or is an invalid symlink.

With Bazel 7.1.0 this behavior raises a very confusing error of the form

ERROR: /home/aj/src/rules_zig/main/e2e/workspace/include-dependencies/zig-std-include/BUILD.bazel:5:11: TreeArtifact include-dependencies/zig-std-include/.zig-cache/global/binary was not created
ERROR: /home/aj/src/rules_zig/main/e2e/workspace/include-dependencies/zig-std-include/BUILD.bazel:5:11: Building include-dependencies/zig-std-include/main.zig as Zig binary bazel-out/k8-fastbuild/bin/include-dependencies/zig-std-include/binary failed: not all outputs were created or valid
tjgq commented 7 months ago

This changed in 5506a0fa81c8f122ee635f9866d891ac885051ef which was cherry-picked as https://github.com/bazelbuild/bazel/commit/1d4660496e15143afa11f840225b11dd0dc877eb. Let me send a CL to restore the clearer error message.

[EDIT: above is incorrect; the commit that caused the error message to change was actually 4247c20f1daeeb787e9b5cb64848d398467effb1]

aherrmann commented 7 months ago

@tjgq Thank you! I tested Bazel as of commit b78d73fab1f01e2e8ae29bed667865a42800b626 and can confirm that the error message is improved:

ERROR: /home/aj/src/rules_zig/main/e2e/workspace/include-dependencies/zig-std-include/BUILD.bazel:5:11: error while validating output tree artifact include-dependencies/zig-std-include/.zig-cache/global/binary: child o/8c1a455b9d0322528743f00b10ab8b93/lld.id is a dangling symbolic link
ERROR: /home/aj/src/rules_zig/main/e2e/workspace/include-dependencies/zig-std-include/BUILD.bazel:5:11: Building include-dependencies/zig-std-include/main.zig as Zig binary include-dependencies/zig-std-include/binary failed: not all outputs were created or valid
Target //include-dependencies/zig-std-include:binary failed to build
Wyverald commented 6 months ago

@coeuvre @tjgq What's the current status of this issue? Is it fixed/mitigated? Are we still targeting 7.2.0 for a fix?

tjgq commented 6 months ago

@Wyverald This issue is a feature request that we haven't yet decided to implement. The sub-discussion started by https://github.com/bazelbuild/bazel/issues/15454#issuecomment-2011456094 is about a confusing error message, which has been fixed at head (b78d73fab1f01e2e8ae29bed667865a42800b626) and cherry-picked into 7.2.0 (86ff365a80810aaefb35279e003bbe9e55b635e5).