Symlink targets to bazel-bin even with multiple transitions

keith commented 4 years ago

In the case of rules_apple we recently enabled transitions for more cases. Previously in a project like this:

load("@build_bazel_rules_apple//apple:ios.bzl", "ios_unit_test")

objc_library(
    name = "foo",
    srcs = ["foo.m"],
)

ios_unit_test(
    name = "test",
    minimum_os_version = "10.0",
    deps = ["foo"],
)

When you built the test target, you would get these output symlinks:

% bazelisk build test
...
  bazel-bin/test
  bazel-bin/test.zip
...

With transitions enabled you get these:

% bazelisk build test
...
  bazel-bin/test
  bazel-out/applebin_ios-ios_x86_64-fastbuild-ST-7bf874b56ea0/bin/test.zip
...

This result is the same even if you enable the undocumented --use_top_level_targets_for_symlinks.

In this specific case I assume this is because the test shell script and the actual test bundle are built for different configurations, but the same applies if you build a macOS target and an iOS target in the same build.

I think it would be great if, in the case that outputs did not conflict across configurations, they were symlinked to a more easily guessable path in bazel-bin. This would help significantly with IDE integrations, which currently have to parse the build event protocol log or otherwise discover these transition paths, where previously they could rely on more stable paths.
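For reference, the BEP-parsing workaround mentioned above might look roughly like the sketch below. It assumes the build was run with --build_event_json_file (one JSON event per line) and reads only the namedSetOfFiles events; the function name output_paths_from_bep is made up for illustration, and a real integration would probably also need to filter by target.

```python
import json
from typing import Iterable, List


def output_paths_from_bep(lines: Iterable[str]) -> List[str]:
    """Collect exec-root-relative file paths from a JSON BEP stream.

    Assumption: `lines` comes from a file written by
    `--build_event_json_file`, with one JSON event per line.
    Only `namedSetOfFiles` events are inspected.
    """
    paths = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        for f in event.get("namedSetOfFiles", {}).get("files", []):
            # pathPrefix carries the configuration-specific root,
            # e.g. ["bazel-out", "<config>", "bin"]; it may be absent.
            prefix = list(f.get("pathPrefix", []))
            paths.append("/".join(prefix + [f["name"]]))
    return paths
```

Usage would be something like `bazel build //:test --build_event_json_file=bep.json`, then passing `open("bep.json")` to the function.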

What operating system are you running Bazel on?

macOS

What's the output of `bazel info release`?

release 3.7.0

gregestren commented 3 years ago

Hi @keith,

by "did not conflict", do you mean no instances of

bazel-bin/mypkg/mypath
bazel-out/applebin_ios-ios_x86_64-fastbuild-ST-7bf874b56ea0/bin/mypkg/mypath

? i.e. if the different roots were merged there would be no overlap?

That's a fair request. The next step would be evaluating how hard that is to determine. I guess looking at which part of the code constructs the bazel-bin symlink and seeing how global its view of the output tree is at that point.
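To make the "no overlap if the roots were merged" condition concrete, here is a minimal sketch. It is not how Bazel decides anything today; it only assumes output paths of the form bazel-out/<config>/bin/<path>, and the names split_output_path and find_conflicts are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_output_path(path: str) -> Tuple[str, str]:
    """Split 'bazel-out/<config>/bin/<rel>' into (config, rel)."""
    parts = path.split("/")
    if len(parts) < 4 or parts[0] != "bazel-out" or parts[2] != "bin":
        raise ValueError(f"unexpected output path layout: {path}")
    return parts[1], "/".join(parts[3:])


def find_conflicts(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Return bin-relative paths that are produced under more than one config.

    An empty result means the roots could be merged without overlap.
    """
    configs_by_rel = defaultdict(set)
    for path in paths:
        config, rel = split_output_path(path)
        configs_by_rel[rel].add(config)
    return {
        rel: sorted(configs)
        for rel, configs in configs_by_rel.items()
        if len(configs) > 1
    }
```

For the example in this issue (test in one configuration, test.zip in another), the result would be empty, so both could in principle live under one bazel-bin.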

gregestren commented 3 years ago

Symlink creation point: https://github.com/bazelbuild/bazel/blob/7e8f86d60adc56159216c4fbb5b403ee8c5aec1c/src/main/java/com/google/devtools/build/lib/buildtool/ExecutionTool.java#L590
Possible context for determining artifact paths: https://github.com/bazelbuild/bazel/blob/75216c74470090c6f720814dbee239ae4102290e/src/main/java/com/google/devtools/build/lib/analysis/AnalysisResult.java#L123 (I'm not sure if the "top-level" qualifier implies a subset of all paths)

keith commented 3 years ago

if the different roots were merged there would be no overlap?

exactly, in the example I showed above it fell back to the full paths, but there wouldn't be conflicts for test.zip so if it was symlinked at the top level it would be fine, and IMO significantly improve UX

GevaZeichner commented 2 years ago

Hi @keith, I was struggling to figure out why the output location differed until I finally found this issue. Are there any workarounds for this? I'm using ios_static_framework and would like to use a hard-coded path to add it to a project.

gregestren commented 2 years ago

FYI I looked into this a few weeks ago. I optimistically thought we could make --use_top_level_targets_for_symlinks more robust. But my research showed it's not as simple as we'd like if we'd like to generally guarantee correctness. So I'm not sure how to proceed.

@GevaZeichner does --use_top_level_targets_for_symlinks help you? As mentioned earlier in this issue, it's not sufficient for all builds. But it does help some builds.

Re: the research I did, I have it scribbled down in random inaccessible locations. Would anyone like me to port it to a Github issue for collaborative brainstorming?

GevaZeichner commented 2 years ago

does --use_top_level_targets_for_symlinks help you? As mentioned earlier in this issue, it's not sufficient for all builds. But it does help some builds.

@gregestren When using that it seems that it reports supposedly checking/outputting to bazel-bin, but actually the file still gets to the same bazel-out/applebin.... output.

INFO: Found 1 target...
Target //Applications/Img:ImgFramework up-to-date:
  bazel-bin/Applications/Img/ImgFramework.zip

gregestren commented 2 years ago

Does bazel-bin not resolve? What that flag does is remap bazel-bin to whatever the target actually is. So I'd expect bazel-bin to actually be a symlink to bazel-out/applebin/....
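A quick way to check where the symlink currently points is sketched below. It assumes the resolved path ends in bazel-out/<config>/bin; the helper name symlink_config is made up for illustration.

```python
import os
from typing import Optional


def symlink_config(link: str = "bazel-bin") -> Optional[str]:
    """Return the configuration directory that `link` resolves into.

    Returns None if the link is missing (for example, cleared by Bazel)
    or if the resolved path does not end in 'bazel-out/<config>/bin'.
    """
    if not os.path.islink(link):
        return None
    parts = os.path.realpath(link).split(os.sep)
    if len(parts) >= 3 and parts[-3] == "bazel-out" and parts[-1] == "bin":
        return parts[-2]
    return None
```

Run from the workspace root, `symlink_config()` returns the configuration directory name, which shows which configuration bazel-bin is tracking after a given build.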

GevaZeichner commented 2 years ago

Oh yes, you're right, I didn't notice it changed the symlink. It points to bazel-out/applebin/... when using that flag and building only that target. When I used objc_library instead of ios_static_framework, my output and the bazel-bin symlink were at bazel-out/ios_x86_64-dbg/bin. I also have other targets that I need to use, and their output still goes to bazel-out/ios_x86_64-dbg/bin. If I use the flag in conjunction with those targets, I get WARNING: cleared convenience symlink(s) bazel-bin, bazel-testlogs because their destinations would be ambiguous. I wonder if there's a way to have ios_static_framework output to bazel-out/ios_x86_64-dbg/bin so that it matches all the others.

gregestren commented 2 years ago

That's where it gets complicated. :p Let me try to paste my notes into another issue.

GevaZeichner commented 2 years ago

Another way to alleviate this issue might be to stabilize the output path instead of using that hash in bazel-out/applebin_....-7bf874b56ea0. If the output path can be relied on to always be the same, I can use that for my project.

photex commented 1 year ago

Ultimately, a stable output path for the products of a build is extremely important isn't it? Not every software shop in the world can bend the knee to whatever Bazel decides is proper in this regard; there must be a way to configure this and allow everything to end up in the desired stable location.

I work for a large commercial software company that builds several industry standard content creation applications. At the moment my recommendation to adopt Bazel is largely blocked by issues like these. Being able to decide where the ultimate output of a build goes is important for automated testing of the applications, release and installer packaging (these are desktop applications), as well as day to day dev and debugging (if my app and plugin targets each cause the content of bazel-bin to change then I can't iterate on it or even debug it properly).

Bazel has so many good reasons for all the things going on behind the scenes so I wouldn't say anything is broken here, but I feel like it's an oversight of a very common use case and need. I can't really think of another (common) build system that doesn't have levers to control this.

fmeum commented 1 year ago

@photex Just want to mention that the recommended, stable way to get the paths of the outputs of a target is bazel cquery --output=files <target>. Those paths use bazel-out and thus also work with bazel-bin cleared or changing.
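For scripts, that command can be wrapped along these lines. This is only a sketch: the function name output_files and the injectable run parameter are illustrative, and it assumes bazel is on PATH and that the same build flags are passed to cquery as to the build, so the configuration matches.

```python
import subprocess
from typing import Callable, List, Sequence


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout; raises if it fails."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def output_files(
    target: str,
    extra_flags: Sequence[str] = (),
    run: Callable[[Sequence[str]], str] = _run,
) -> List[str]:
    """Return the output paths reported by `bazel cquery --output=files`.

    `extra_flags` should mirror the flags used for the actual build.
    `run` is injectable so the function can be exercised without Bazel.
    """
    cmd = ["bazel", "cquery", "--output=files", *extra_flags, target]
    return [line for line in run(cmd).splitlines() if line.strip()]
```

For example, `output_files("//Applications/Img:ImgFramework")` would return the bazel-out/... paths for that target without relying on bazel-bin.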

gregestren commented 1 year ago

Hi @photex - I'm open to advocate for better ideas. But it's not clear to me what you're suggesting that wouldn't compromise build correctness.

Just to make sure we're all on the same page:

- A given bazel invocation with a given bazel binary over a stable set of input files will produce a stable output path.
- That stable path is bazel-out/..., which is indeed hard for a user to read or remember.
- cquery --output=files, which @fmeum noted (and contributed), is a new way to retrieve those paths where you only have to remember the build target. Note that didn't exist when this bug was filed. That was added to help with this kind of problem.
- bazel-bin has always been intended as a "convenience" symlink: it provides a clean, simple mapping when possible. But it's not powerful enough to cover every use case while preserving build correctness. There's no trivial way to extend that as a generic solution.

I like the idea of applying bazel-bin to more cases, like the one that inspired this bug. I sketched out my thoughts on https://github.com/bazelbuild/bazel/issues/15005 and realized there are some real complications to make that work, with different answers depending on how people want to use the feature.

I'd like to think cquery --output=files is a good solution for automated testing (not as much for interactive users, since it requires an extra command): it's a fairly straightforward way to avoid all this intrinsic complexity.

Yes, this is likely more of a complication in Bazel than other build systems. I think a lot of that is due to Bazel's stronger correctness requirements and build sharding & execution capabilities. Bazel has to handle some considerations other build systems don't to effectively support these capabilities.

Still, I agree it's great to provide clean support for the "common" use case even if things get complicated for uncommon use cases. I'd love to hear more examples of your day to day usage where this pops up. Or what levers you're thinking of: maybe there are levers Bazel could support without complications.

Or maybe there's some way to ignore some of the corner cases in https://github.com/bazelbuild/bazel/issues/15005 and still cover meaningful use cases.

photex commented 1 year ago

I've been chewing on this a bit because there are many cases, such as assembling files for testing, where this would be scriptable and should work with what is suggested here with a bit of elbow grease. But for day to day development it definitely breaks the workflow and I don't know that Bazel can accommodate the setup required yet so I'm going to write a lot about the setup I'm facing and maybe something useful can come out of it. :D

The application we develop is comprised of not only a suite of libraries/frameworks and the primary application, but a large assortment of plugins and data. Some of the frameworks we build are loaded at runtime, and some are loaded only when loading a plugin, etc.

Currently, our build places everything into a particular staging arrangement and we can run and debug quite easily thanks to this. There are also some files that developers can place into this staging area to control, that shouldn't be a part of the build or checked into version control (logging and tracing configs, feature gating options, developer only tweaks for certain debugging needs).

To give some examples:

OutputRoot
- <architecture:arm64/x86_64/x86_64_arm64>
- Packages (currently, this means "things from Conan")
- Projects (currently, this means external CMake binary directories)

The architecture directory is possible to skip, but because we support several combinations for MacOS in particular, no devs skip it. I think only our official release process skips it as it only builds universal binaries.

From here, the CMake portion of the build treats OutputRoot/<architecture> as its build directory (CMAKE_BINARY_DIR). Portions of the build do not use CMake, and that is nearly everything still, from core frameworks to the application itself, which have dedicated MSBuild and Xcode projects in various subdirectories of the repo. All outputs from those builds land in OutputRoot/<architecture>/<config> where "config" is Debug or Release. All targets in the CMake build conform to the same layout.

And here is where we get to the important bit. The layout under <config> is what's important for day to day running and debugging workflows.

OutputRoot
- <arch>/<config>
- Plug-Ins
  - All bundles compiled and placed here and are found by the app
- OurApp.app
- All of the relevant and required Frameworks (for a release these get placed into the app, but during development, when targets get built independently, the "Frameworks" folder in the app bundle is a symlink to ../../)
- And then this is the folder (ignored by git) where developers can place certain config files when needed.
- There are some other tools and apps that land here as well which are used by the application at runtime. For releases they are bundled with the app and unpacked to the system at first run.

Most of this setup isn't required if we were to switch to Bazel, the primary need is the Plug-Ins folder and a standard place to find the helper apps and developer configs. The symlink to Frameworks and so on and so forth is largely a legacy thing stemming from a couple of decades of Xcode and MSBuild. (Speaking of MSBuild, the layout on Windows is almost entirely different, but fundamentally we have the same requirements where plugin bundles and helper apps need to land in a stable and expected location relative to the application binary).

When it comes to actually building any of these things I wouldn't suggest changing how Bazel handles it. It's just that the results need to land someplace collectively so that people can run and debug the application and any plugins they're working on.

I'd just expect that bazel run //:OurApp to "do the right thing". And I'd expect to run with the debugger, or attach to the app and have things work (like being able to step into plugin code).

Currently this doesn't happen. If I have a plugin bundle and an application built together, only one of them ends up in bazel-bin, and although both are someplace in bazel-out, it's not a unified location for all plugins, and our application should not be running bazel when it starts up to find all its plugins.

Currently we have a meta build driver command script thing. We have this because we have all these platform specific and legacy builds, conan, and then our attempts to migrate to CMake. We define targets and have post build scripts for things. It's a mess, and it's my motivation for exploring Bazel and trying to make a reasonable recommendation to adopt it.

My main questions after chewing on this:

We can definitely use bazel to find the outputs and copy them someplace, maybe symlink them. But would that break debugging? And how does that affect IDE workspaces?

Does the bazel query for output files work before a build? Are the paths stable? Even after pulling the latest changes?

Could we run a command to setup our common output directory with symlinks to everything ahead of time and then just work? (This sounds like it will be somewhat awkward to support Debug and Release in this way.)

Does it break Bazel's correctness and efficiency to allow BUILD files to specify a final layout for targets after they are built? bazel-products, bazel-products/Plug-Ins, bazel-products/Frameworks, and so on.
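As a rough sketch of the symlink idea in the third question: given a mapping from staged locations to build outputs (for example, paths obtained from bazel cquery --output=files), a small script could lay out the staging directory. The name stage_outputs and the layout mapping are hypothetical, and this says nothing about whether debuggers or IDEs follow the symlinks correctly.

```python
import os
from typing import Dict


def stage_outputs(layout: Dict[str, str], stage_root: str) -> None:
    """Create symlinks under `stage_root` pointing at build outputs.

    `layout` maps a path relative to `stage_root` (e.g.
    "Plug-Ins/MyPlugin.bundle") to the output path it should point at.
    Existing files or symlinks at the destination are replaced; an
    existing real directory at the destination is an error.
    """
    for rel_dest, source in layout.items():
        dest = os.path.join(stage_root, rel_dest)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        if os.path.lexists(dest):
            os.remove(dest)  # removes the old link, not its target
        os.symlink(os.path.abspath(source), dest)
```

Because the links point at configuration-specific paths, they would need to be regenerated whenever those paths change, for example when switching between Debug and Release.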

gregestren commented 1 year ago

@photex this is a lot to chew on, so I'm sorry I haven't responded more quickly. I'm still digesting your feedback but a few quick responses to the above:

- We used to have a rule called fileset that let builders map files to arbitrary directories. I'd like to think that pattern could help here. fileset no longer exists and, as I understand it, something like pkg_files sort of replaces it. But that's focused on packaging vs. simply rearranging build outputs. @aiuto @tjgq for more advice on this: what's best practice for users writing their own directory hierarchies to land final build outputs?
- Bazel has plenty of IDE integration. I can try to loop in IDE experts for more perspective on this (maybe @comius ?)
- The paths from bazel cquery --output=files may become stale after updating the repo state or changing Bazel versions. This could be as simple as a rule definition changing what transition it applies. If you only update source files, I'd expect the paths to stay stable.

Anyone I CC'ed: look particularly at the bottom of the last comment, below the line divider. That's a focused, concise summary of the questions we're discussing.

aiuto commented 1 year ago

@photex

This definitely sounds like something for rules_pkg. With pkg_files you have the ability to gather outputs from other rules and specify where they should map to in a virtual output tree. Then you can use the pkg_install feature to manifest that tree into existence. I've also been planning to add a "create the tree as a symlink forest" feature.

For an example, see https://github.com/bazelbuild/rules_pkg/tree/main/examples/rich_structure. Here we have a source tree that does not represent the final arrangement of files. It happens to create tar files at the end, but it could just as easily be the symlink tree.

photex commented 1 year ago

@aiuto that is really encouraging!

keith commented 1 year ago

Related: https://github.com/bazelbuild/bazel/pull/18854, but it doesn't have any effect on this specifically.

github-actions[bot] commented 2 months ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.

jiawen commented 2 months ago

Not stale.
