bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0

Bazel@HEAD crashes if XDG_CACHE_HOME starts with `~/` #21660

Closed fmeum closed 8 months ago

fmeum commented 8 months ago

Description of the bug:

Since https://github.com/bazelbuild/bazel/commit/05ae91f1a04b55af94aaa7b52ef34cb3a6ce7fa4, if XDG_CACHE_HOME starts with ~/, the Bazel server crashes at startup (GitHub Actions log follows):

2024-03-12T00:16:26.1491466Z ##[group]Run bazel --bazelrc=/home/runner/work/with_cfg.bzl/with_cfg.bzl/.github/workflows/ci.bazelrc --bazelrc=.bazelrc test //...
bazel --bazelrc=/home/runner/work/with_cfg.bzl/with_cfg.bzl/.github/workflows/ci.bazelrc --bazelrc=.bazelrc test //...
shell: /usr/bin/bash -e {0}
env:
  XDG_CACHE_HOME: ~/.cache/bazel-repo
2024/03/12 00:16:29 Using unreleased version at commit 73e8a9e18a8638abd04e7a895e27d62ccfb2f549
2024/03/12 00:16:29 Downloading https://storage.googleapis.com/bazel-builds/artifacts/centos7/73e8a9e18a8638abd04e7a895e27d62ccfb2f549/bazel...
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
INFO: Reading 'startup' options from /home/runner/work/with_cfg.bzl/with_cfg.bzl/examples/.bazelrc: --nowindows_enable_symlinks
Server crashed during startup. Now printing /home/runner/work/with_cfg.bzl/with_cfg.bzl/examples/~/.cache/bazel-repo/bazel/_bazel_runner/0e0ab7529d19bd6a77ace5bd20443459/server/jvm.out
OpenJDK 64-Bit Server VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.
Cannot enumerate embedded binaries: /home/runner/.cache/bazel-repo/bazel/_bazel_runner/install/7d0c5e353f9b55f289576ab6aa36b737 (No such file or directory)
com.google.devtools.build.lib.util.AbruptExitException: Cannot enumerate embedded binaries: /home/runner/.cache/bazel-repo/bazel/_bazel_runner/install/7d0c5e353f9b55f289576ab6aa36b737 (No such file or directory)
    at com.google.devtools.build.lib.runtime.BlazeRuntime.createFilesystemExitException(BlazeRuntime.java:1320)
    at com.google.devtools.build.lib.runtime.BlazeRuntime.newRuntime(BlazeRuntime.java:1307)
    at com.google.devtools.build.lib.runtime.BlazeRuntime.serverMain(BlazeRuntime.java:1030)
    at com.google.devtools.build.lib.runtime.BlazeRuntime.main(BlazeRuntime.java:774)
    at com.google.devtools.build.lib.bazel.Bazel.main(Bazel.java:97)
Caused by: java.io.FileNotFoundException: /home/runner/.cache/bazel-repo/bazel/_bazel_runner/install/7d0c5e353f9b55f289576ab6aa36b737 (No such file or directory)
    at com.google.devtools.build.lib.unix.NativePosixFiles.readdir(Native Method)
    at com.google.devtools.build.lib.unix.NativePosixFiles.readdir(NativePosixFiles.java:202)
    at com.google.devtools.build.lib.unix.UnixFileSystem.readdir(UnixFileSystem.java:185)
    at com.google.devtools.build.lib.vfs.Path.readdir(Path.java:267)
    at com.google.devtools.build.lib.exec.BinTools.scanDirectoryRecursively(BinTools.java:132)
    at com.google.devtools.build.lib.exec.BinTools.forProduction(BinTools.java:80)
    at com.google.devtools.build.lib.runtime.BlazeRuntime.newRuntime(BlazeRuntime.java:1305)
    ... 3 more
Process completed with exit code 37.

Java IO functions appear to resolve ~ while Bazel's native filesystem routines don't.
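
The difference between the two paths in the log can be reproduced outside Bazel. The following is a minimal Python sketch, not Bazel's code, and it assumes POSIX path rules: to a path API, a value starting with `~/` is a relative path whose first component is a directory literally named `~`, and it only becomes a home directory when something expands it explicitly. The working directory is copied from the log above.

```python
import posixpath

raw = "~/.cache/bazel-repo"  # XDG_CACHE_HOME value from the log above
cwd = "/home/runner/work/with_cfg.bzl/with_cfg.bzl/examples"

# No leading "/", so path APIs treat the value as relative.
is_abs = posixpath.isabs(raw)
print(is_abs)  # False

# Resolved against the working directory, "~" is an ordinary directory name.
joined = posixpath.join(cwd, raw)
print(joined)  # /home/runner/work/with_cfg.bzl/with_cfg.bzl/examples/~/.cache/bazel-repo

# Tilde expansion only happens when it is requested explicitly.
expanded = posixpath.expanduser(raw)
print(expanded)
```

The joined path has the same shape as the `examples/~/.cache/bazel-repo/...` location of `jvm.out` in the log, while the explicitly expanded form corresponds to the `/home/runner/.cache/bazel-repo/...` path that the server failed to find.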

Which category does this issue belong to?

Core

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

No response

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

Yes, caused by https://github.com/bazelbuild/bazel/commit/05ae91f1a04b55af94aaa7b52ef34cb3a6ce7fa4.

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

fmeum commented 8 months ago

Cc @tetromino

FYI @alexeagle as this seems to affect all repos using the bazel-contrib ruleset template.

tetromino commented 8 months ago

Funny bug - a new variation on the theme of https://github.com/bazelbuild/bazel/pull/11987

Let me think how to fix this in the least ugly way...

tetromino commented 8 months ago

I think the right behavior is to not shell-expand ~ in standard env variables like HOME (which would be an expansion bomb) and XDG_CACHE_HOME. We already take care to not expand ~ in TEST_TMPDIR, and most tools do not shell-expand ~ in paths coming from env variables.
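
To illustrate why tools normally see an unexpanded `~`, here is a small Python sketch (an illustration only, unrelated to Bazel's implementation). It starts a child process without a shell and passes the variable through the environment, which is roughly what happens when a CI system sets `env:` values directly. The helper name `child_sees` is hypothetical.

```python
import os
import subprocess
import sys


def child_sees(value: str) -> str:
    """Run a child process with XDG_CACHE_HOME=value and return what it reads."""
    code = "import os; print(os.environ['XDG_CACHE_HOME'])"
    result = subprocess.run(
        [sys.executable, "-c", code],  # argument list, so no shell is involved
        env={**os.environ, "XDG_CACHE_HOME": value},
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


# The child receives the literal string; nothing expands "~" on the way.
print(child_sees("~/.cache/bazel-repo"))
```

Tilde expansion is performed by a shell when it parses an unquoted word, so a value that never passes through a shell arrives in the child process exactly as written.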

tetromino commented 8 months ago

@bazel-io fork 7.2.0

fmeum commented 8 months ago

The commit fixes the crash in the affected repo. Unfortunately, an XDG_CACHE_HOME that starts with ~ still does not work: the value is treated as a relative path, so the output root ends up inside the workspace, Bazel traverses it, and the build fails with an infinite symlink expansion error (log below). The error is especially cryptic because Bazel prints these paths with their leading ~, which obscures the fact that they are interpreted as relative. Maybe failing with a descriptive error in this case would be better?

INFO: Reading rc options for 'test' from /home/runner/work/with_cfg.bzl/with_cfg.bzl/.github/workflows/ci.bazelrc:
  Inherited 'build' options: --announce_rc --disk_cache=~/.cache/bazel
INFO: Reading rc options for 'test' from /home/runner/work/with_cfg.bzl/with_cfg.bzl/.github/workflows/ci.bazelrc:
  'test' options: --test_output=errors --test_env=XDG_CACHE_HOME
Computing main repo mapping: 
Computing main repo mapping: 
WARNING: For repository 'rules_java', the root module requires module version rules_java@7.3.1, but got rules_java@7.4.0 in the resolved dependency graph. Please update the version in your MODULE.bazel or set --check_direct_dependencies=off
Loading: 
Loading: 0 packages loaded
ERROR: infinite symlink expansion detected
[start of symlink chain]
/home/runner/work/with_cfg.bzl/with_cfg.bzl
/home/runner/work/with_cfg.bzl/with_cfg.bzl/examples/~/.cache/bazel-repo/bazel/_bazel_runner/0e0ab7529d19bd6a77ace5bd20443459/external/with_cfg.bzl~
[end of symlink chain]
ERROR: Infinite symlink expansion, for ~/.cache/bazel-repo/bazel/_bazel_runner/0e0ab7529d19bd6a77ace5bd20443459/external/with_cfg.bzl~, skipping: Infinite symlink expansion
ERROR: error loading package under directory '': error loading package '~/.cache/bazel-repo/bazel/_bazel_runner/0e0ab7529d19bd6a77ace5bd20443459/external/bazel_tools/tools/android': Unable to find package for @@[unknown repo 'rules_python' requested from @@]//python:defs.bzl: The repository '@@[unknown repo 'rules_python' requested from @@]' could not be resolved: No repository visible as '@rules_python' from main repository.
INFO: Elapsed time: 4.932s
INFO: 0 processes.
ERROR: Build did NOT complete successfully
ERROR: Couldn't start the build. Unable to run tests
Process completed with exit code 1.
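
One possible shape for such a descriptive error, sketched in Python purely for illustration: this is not how Bazel behaves, and the function and exception names are hypothetical. The XDG Base Directory Specification requires paths in these variables to be absolute and tells implementations to treat relative ones as invalid, so the sketch rejects any non-absolute value and adds a hint when the value starts with `~`.

```python
import posixpath


class InvalidCacheHomeError(ValueError):
    """Hypothetical error for an unusable XDG_CACHE_HOME value."""


def validate_cache_home(value: str) -> str:
    """Return value unchanged if it is absolute, otherwise raise with a hint."""
    if posixpath.isabs(value):
        return value
    hint = ""
    if value == "~" or value.startswith("~/"):
        # The literal "~" was never expanded by a shell; say so explicitly.
        hint = " A leading '~' is not expanded; use an absolute path instead."
    raise InvalidCacheHomeError(
        f"XDG_CACHE_HOME must be an absolute path, got {value!r}.{hint}"
    )


try:
    validate_cache_home("~/.cache/bazel-repo")
except InvalidCacheHomeError as err:
    print(err)
```

Whether to fail, as here, or to ignore the value and fall back to a default location is a design choice; the specification suggests ignoring invalid values.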

tetromino commented 8 months ago

@fmeum - I think it is a user error to set XDG_CACHE_HOME to the literal string "~/something" and expect the "~" to be expanded by tools downstream. If you set XDG_CACHE_HOME to the literal string "$(cat /etc/passwd)", surely you would not expect tools to shell-expand it and print the contents of /etc/passwd to logs...

fmeum commented 8 months ago

I agree that it's a user error, and I also think that this was the right way to fix it. I just wish the error message were clearer in this case, though I'm not sure how to improve it.

iancha1992 commented 6 months ago

A fix for this issue has been included in Bazel 7.2.0 RC1. Please test out the release candidate and report any issues as soon as possible. If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=7.2.0rc1. Thanks!