GoogleCloudPlatform / container-definitions

This repository contains Bazel targets for Google-maintained common container definitions and their dependencies.
Apache License 2.0

bazel container fails with "FATAL: mkdir('/tmp/build_output'): (error: 13): Permission denied" #12487

Open kyleg907 opened 3 years ago

kyleg907 commented 3 years ago

I'm trying to use the l.gcr.io/google/bazel:latest container as described on the Getting Started with Bazel Docker Container page. I suspect this is a problem with the doc, but it might be a bug. I'm unfamiliar with how this container is built, so I can't debug it further.

The command as documented fails with FATAL: mkdir('/tmp/build_output'): (error: 13): Permission denied

docker run \
  -e USER="$(id -u)" \
  -u="$(id -u)" \
  -v /tmp/src/workspace:/src/workspace \
  -v /tmp/build_output:/tmp/build_output \
  -w /src/workspace \
  l.gcr.io/google/bazel:latest \
  --output_user_root=/tmp/build_output \
  build //absl/...

(Well, I tweaked it a bit from the doc, since creating /src needs root. I did create the directories used here.)

This command is dubious because the SELinux labels are not taken into consideration, so one would expect problems reading or writing data at /src/workspace or /tmp/build_output anyway. Since /tmp/build_output is bind mounted into the container, it already exists there as a mount point, so it is not surprising that mkdir without -p would also fail. The problem might also be related to #4677.

Edit: the mkdir must be bazel processing the --output_user_root parameter. The doc for --output_user_root says the directory must either not exist or be owned by the calling user, so this might be due to uid mapping with the container.
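To make the ownership condition on --output_user_root concrete, here is a minimal sketch for checking the host directory before starting the container. The helper name check_output_root is made up for illustration, and `stat -c` assumes GNU coreutils (as on the RHEL and Ubuntu hosts mentioned below). It only looks at the host side, so it says nothing about how uids map inside the container.

```shell
# Hypothetical helper: report whether a directory is absent, owned by the
# calling user, or owned by someone else.
check_output_root() {
  dir="$1"
  if [ ! -e "$dir" ]; then
    echo "absent"
  elif [ "$(stat -c '%u' "$dir")" = "$(id -u)" ]; then
    echo "owned"
  else
    echo "foreign"
  fi
}

# Check the directory used in the commands in this issue.
check_output_root /tmp/build_output
```

A result of "foreign" would be consistent with the permission error, but it is only a hint, not a confirmed diagnosis.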

Tweaking the command works, though.

Here we run as root inside the container and use the :Z option to set the SELinux labels. I assume the mkdir still happens inside the container, and running as root might work around the problem. This build works, though ownership of the files on the docker host will be an issue to deal with.

docker run \
  -v /tmp/src/workspace:/src/workspace:Z \
  -v /tmp/build_output:/tmp/build_output:Z \
  -w /src/workspace \
  l.gcr.io/google/bazel:latest \
  --output_user_root=/tmp/build_output \
  build //absl/...

# Build works!
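As a rough way to see the ownership fallout from running as root, the sketch below lists everything under a directory that the calling user does not own. The helper name list_foreign_files is hypothetical; it relies only on `find` with `! -user`, which accepts a numeric uid.

```shell
# Hypothetical helper: print paths under a directory that are not owned by
# the calling user (for example, files written by root in the container).
list_foreign_files() {
  dir="$1"
  find "$dir" ! -user "$(id -u)"
}

# Usage (assumes the directory exists):
#   list_foreign_files /tmp/build_output
```

Anything it prints could then be handed back to the calling user with something like `sudo chown -R "$(id -u):$(id -g)" /tmp/build_output`, assuming sudo is available on the host.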

Another option is to leave off the bind mount for /tmp/build_output. Not ideal, but you could still use docker cp (or podman mount) to get the stuff you need out of the container.

docker run \
  -e USER="$(id -u)" \
  -u="$(id -u)" \
  -v /tmp/src/workspace:/src/workspace:Z \
  -w /src/workspace \
  l.gcr.io/google/bazel:latest \
  --output_user_root=/tmp/build_output \
  build //absl/...

# Build works!
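For the docker cp route, a sketch of what that could look like: run the same build in a named container, copy the output root to the host, then remove the container. The function name run_and_copy, the container name, and the destination path are illustrative assumptions, and this presumes /tmp/build_output lives in the container's writable layer so it survives after the container exits.

```shell
# Sketch only: build without bind-mounting the output root, then copy the
# outputs out of the stopped container.
run_and_copy() {
  name="$1"   # container name (hypothetical, e.g. bazel-build)
  dest="$2"   # host path to receive the outputs
  docker run --name "$name" \
    -e USER="$(id -u)" \
    -u="$(id -u)" \
    -v /tmp/src/workspace:/src/workspace:Z \
    -w /src/workspace \
    l.gcr.io/google/bazel:latest \
    --output_user_root=/tmp/build_output \
    build //absl/... || return 1
  # docker cp works on stopped containers as well as running ones.
  docker cp "$name":/tmp/build_output "$dest" || return 1
  docker rm "$name"
}

# Usage:
#   run_and_copy bazel-build /tmp/build_output_copy
```

If the build fails, the container is deliberately left in place so it can be inspected before being removed by hand.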

So if we pick one of those (probably the root version), maybe this boils down to a doc issue. I'm willing to contribute an update.

Other things tried

I tried changing the bind mount to mount over /tmp (-v /tmp/build_output:/tmp), thinking the mkdir would work, but still got the error. I also used a newly created named volume (-v absl-workspace:/tmp/build_output), and that also failed. Changing --output_user_root to a subdirectory of the bind mount gave the same error code 13.

I had set out to revisit the intra-container permission problems I saw last year, like #4677. Since I am able to run the container now, perhaps that problem has been addressed. Either that, or this mkdir problem is now hiding it in some cases.

I tested with the same results on RHEL 8.2 with podman 1.93 and on Ubuntu 20.04.2 with Docker 19.03.8. The bazel image was

l.gcr.io/google/bazel latest 5cac8433a9d7 51 years ago 1.62GB

jizusun commented 3 years ago

I have the same issue