docker / buildx

Docker CLI plugin for extended build capabilities with BuildKit
Apache License 2.0

docker-container driver: `COPY --link --chown` unexpectedly changes parent dir ownership #1855

Open polarathene opened 1 year ago

polarathene commented 1 year ago

Description

Originally reported at BuildKit: https://github.com/moby/buildkit/issues/3912


Scenario: Adding content from a different image on DockerHub via COPY --link, and correcting the ownership with --chown=<uid>.

When building with the buildx docker-container driver, the parent directories (/var and /var/lib) also appear to have their ownership changed to the --chown value.

It's possible that this bug is related to how the feature(s) work when pulling an image from a registry, or when building an image locally with --load via the docker-container driver. The native docker driver does not build images with this ownership bug.

Expected behaviour

/var and /var/lib should not have their ownership changed since they already exist in the image, only /var/lib/clamav was copied over.

Actual behaviour

The parent directories' ownership is changed by COPY --link --chown.

Buildx version

github.com/docker/buildx v0.10.4 c513d34

Docker info

docker info

```
Client: Docker Engine - Community
 Version:    24.0.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.10.4
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.18.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 3
  Running: 3
  Paused: 0
  Stopped: 0
 Images: 25
 Server Version: 24.0.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
 runc version: v1.1.7-0-g860f061
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.2.0-20-generic
 Operating System: Ubuntu 23.04
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 945.4MiB
 Name: vpc-ubu
 ID: 028ce824-5aaf-4d0b-97cc-c31018736f0f
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
```

Builders list

NAME/NODE DRIVER/ENDPOINT             STATUS  BUILDKIT PLATFORMS
con *     docker-container                             
  con0    unix:///var/run/docker.sock running v0.11.6  linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386
default   docker                                       
  default default                     running v0.11.6  linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386
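Since the bug only shows up with the docker-container driver, it helps to confirm which driver the active builder uses before comparing results. A minimal sketch, assuming the two `docker buildx ls` layouts that appear in this issue (the active builder is marked with `*`). `current_driver` is a hypothetical helper name, not part of buildx:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `docker buildx ls` output on stdin and print
# the driver of the builder marked with `*`.
current_driver() {
  awk '
    $1 ~ /\*$/ { print $2; exit }  # layout "con*   docker-container"
    $2 == "*"  { print $3; exit }  # layout "con *  docker-container"
  '
}

# Usage (requires Docker): docker buildx ls | current_driver
```

If this prints `docker` rather than `docker-container`, the build is going through the native driver, which per the description above does not show the ownership change.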

Configuration

# syntax=docker.io/docker/dockerfile:1

FROM docker.io/debian:11-slim
RUN adduser --quiet --system --group --disabled-password --home /var/lib/clamav --no-create-home --uid 200 clamav
COPY --link --chown=200 --from=docker.io/clamav/clamav:latest /var/lib/clamav /var/lib/clamav
$ docker buildx create --driver docker-container --name con --use
$ docker buildx build -t test --load .
$ docker run --rm test ls -l /var | grep clamav

drwxr-xr-x 1 clamav clamav 4096 May 29 05:46 lib
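To check the parent directories without relying on user names, the numeric ownership can be compared against what the base image ships. A sketch, assuming `/var` and `/var/lib` are owned by `0:0` in `debian:11-slim` and that `stat -c` is available in the image. `unexpected_owners` is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "uid:gid path" lines on stdin and report every
# path whose owner differs from the expected "uid:gid" given as $1.
unexpected_owners() {
  local expected="$1" owner path
  while read -r owner path || [[ -n "${owner}" ]]; do
    if [[ "${owner}" != "${expected}" ]]; then
      printf '%s is %s, expected %s\n' "${path}" "${owner}" "${expected}"
    fi
  done
}

# Usage (requires Docker and the `test` image built above):
#   docker run --rm test stat -c '%u:%g %n' /var /var/lib | unexpected_owners 0:0
```

No output means the parents kept the expected owner. Any line printed names a directory that the COPY layer has altered.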

Additional info

This bug seems potentially related to:


We do have these current releases where you can observe this by pulling from the registry:

Pulling the v12 image or anything newer has /var and /var/lib with ownership of clamav / 200, when that should only apply from /var/lib/clamav as per the Dockerfile.


Originally we used COPY --link until we realized that the UID/GID value mapping was not reliable, and that the clamav user and group names could not be used with --chown together with --link. We now create the user explicitly, before installing a package that would otherwise create a clamav user/group, and reference that stable UID for --chown: https://github.com/moby/buildkit/issues/2987#issuecomment-1396753289

thaJeztah commented 1 year ago

/var and /var/lib should not have their ownership changed since they already exist in the image, only /var/lib/clamav was copied over

My guess is that this happens because COPY --link is effectively equivalent to:

FROM scratch
COPY ......

Here the "chown" affects the parent directories because scratch does not have those directories, so the COPY step can't know that they already exist in the image the COPY layer will be applied to. The parent directories are therefore created and chowned, as they're considered new content added as part of the COPY.

But perhaps @jedevc or @tonistiigi could back that theory
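The theory above can be illustrated without Docker, since image layers are tar archives applied on top of each other. The sketch below is only an analogy, not BuildKit's implementation. It uses file modes instead of owners (changing owners would need root) and assumes GNU `tar` and `stat`. A layer built from an empty root creates its parent directories with default metadata, and that metadata travels with the layer when it is extracted over an existing tree:

```shell
#!/usr/bin/env bash
# Analogy only: mimics applying a layer that was built from an empty root.
# Runs in a subshell so umask, cwd and the cleanup trap stay local.
simulate_scratch_layer() (
  umask 022
  work="$(mktemp -d)"
  trap 'rm -rf "${work}"' EXIT

  # "dest" image: parent directories already exist with mode 770.
  mkdir -p "${work}/base/foo/bar"
  chmod 770 "${work}/base/foo" "${work}/base/foo/bar"

  # Layer built on an empty root: parents are created with defaults (755).
  mkdir -p "${work}/layer/foo/bar/baz"
  touch "${work}/layer/foo/bar/baz/a"

  cd "${work}/base"
  echo "before:"; stat -c '%a %n' foo foo/bar

  # Apply the layer on top of the existing tree, as a tar extraction.
  tar -C "${work}/layer" -cf - foo | tar -xf -
  echo "after:"; stat -c '%a %n' foo foo/bar
)

simulate_scratch_layer
```

With GNU tar's default of overwriting the metadata of existing directories, the second listing is expected to show the layer's 755 rather than the original 770, which mirrors the parent directory changes reported here.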

polarathene commented 1 year ago

@tonistiigi mentioned these in the past:

In the meantime, it may be a gotcha worth documenting? Existing parent paths will need an extra RUN afterwards to restore the ownership of each parent directory. I suppose that may also affect permissions (rwx+ugo) but have not verified.
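Until this is fixed or documented, the extra RUN could be generated rather than hand-written. A sketch of that workaround, assuming the parents should go back to `root:root` (true for the paths in this report, but an assumption in general). `parent_dirs` is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each parent directory of an absolute path,
# outermost first, excluding the path itself.
parent_dirs() {
  local path="${1%/}" current="" i
  local -a segments
  IFS='/' read -r -a segments <<< "${path#/}"
  # Stop before the last segment: that is the COPY target, not a parent.
  for (( i = 0; i < ${#segments[@]} - 1; i++ )); do
    current+="/${segments[i]}"
    printf '%s\n' "${current}"
  done
}

# Emit the Dockerfile line to place after the COPY instruction.
mapfile -t dirs < <(parent_dirs /var/lib/clamav)
echo "RUN chown root:root ${dirs[*]}"
# prints: RUN chown root:root /var /var/lib
```

This only restores ownership. If permissions turn out to be affected as well, a matching `chmod` with the base image's original modes would be needed in the same RUN.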

polarathene commented 3 months ago

For clarity the COPY --link --chown problem:

UPDATE: Full overview documented here.


Versions info at end of comment.

TL;DR: Skip the remainder of this comment; the follow-up provides a reproduction example where --chmod possibly works around the problem, which might help identify what is going wrong.


Observations

The docs for COPY --link / COPY --chown do not describe why that difference is observed:

Given those observations:


I suppose that may also affect permissions (rwx+ugo) but have not verified.

I have observed that if I had permissions such as 2700 for /var/lib/clamav these were modified to 0755.

However... when I added --chmod into the mix (it did not seem to matter what value was used there), the parent path segments were no longer altered in ownership or permissions! 😎

I'll provide a comment after this one with insights for this.


Reproduction versions

Still reproducible with Docker Engine 26.1.1 + buildx 0.14.0 + docker-container driver (BuildKit 0.15.2).

$ docker buildx ls

NAME/NODE       DRIVER/ENDPOINT                   STATUS    BUILDKIT
con*            docker-container
 \_ con0         \_ unix:///var/run/docker.sock   running   v0.15.2
default         docker
 \_ default      \_ default                       running   v0.13.2

NOTE: Using docker buildx create --name bk-13 --driver-opt image=moby/buildkit:v0.13.2 for docker-container driver did not change the outputs produced (via docker buildx build --builder bk-13 ...).
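To repeat that comparison across several BuildKit releases, the builder setup can be scripted. A sketch that only prints the commands (a dry run) so they can be reviewed before running. The `bk-<version>` builder names and the `print_builder_cmds` helper are hypothetical, while `--driver-opt image=` is the same option used in the note above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the commands that create one
# docker-container builder per BuildKit version and build with it.
print_builder_cmds() {
  local version name
  for version in "$@"; do
    # Replace dots so the builder name stays free of unusual characters.
    name="bk-${version//./-}"
    printf 'docker buildx create --name %s --driver docker-container --driver-opt image=moby/buildkit:%s\n' "${name}" "${version}"
    printf 'docker buildx build --builder %s --load --tag bug-copy-link .\n' "${name}"
  done
}

print_builder_cmds v0.13.2 v0.15.2
```

Piping the output to `bash` would execute the commands. Keeping it as a dry run by default avoids creating builders by accident.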

$ docker info

Client:
 Version:    26.1.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.0-desktop.1
    Path:     /usr/local/lib/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.27.0-desktop.2
    Path:     /usr/local/lib/docker/cli-plugins/docker-compose
#...

Server:
# ...
 Server Version: 26.1.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: e377cd56a71523140ca6ae87e30244719194a521
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
 Kernel Version: 5.15.123.1-microsoft-standard-WSL2
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
polarathene commented 3 months ago

The --chmod flag potentially fixes this issue (provided the --link feature was still actually applied).

Summary:

Does --chmod prevent the --link functionality? Or is it correcting the undesirable override behaviour?


Reproduction

A copy/paste Dockerfile example; the output shared below shows the different behaviours.

# syntax=docker.io/docker/dockerfile:1

FROM alpine AS base
RUN apk --no-cache add eza
CMD eza -lanhog --tree --no-time --no-filesize --no-permissions /foo

# Base permissions and ownership: `777 100:100`
FROM base AS src
WORKDIR /foo/bar/baz
RUN <<HEREDOC
  touch a b c
  chown -R 100:100 /foo
  chmod -R 777 /foo
HEREDOC

# Adjusts parent dirs permissions and ownership to: `770 200:200`
FROM base AS dest
RUN <<HEREDOC
  mkdir -p /foo/bar/baz
  chown -R 200:200 /foo
  chmod -R 770 /foo
HEREDOC

# Variant stages that may adjust permissions to `700` or ownership to `300:300`:

# Parent dirs become `755 0:0`, unexpected `--link` behaviour with `docker-container` driver:
FROM dest AS bug
COPY --link --from=src /foo/bar/baz /foo/bar/baz

# Parent dirs become `755 300:300`, but they are expected to remain as `770 200:200`:
FROM dest AS with-chown
COPY --link --chown=300 --from=src /foo/bar/baz /foo/bar/baz

# Any value with `--chmod` will prevent the `--link` + `--chown` affecting parent segments:
FROM dest AS with-chmod
COPY --link --chown=300 --chmod=700 --from=src /foo/bar/baz /foo/bar/baz

# Fix? - Any invalid value for `--chmod` will work too (without actually altering permissions):
FROM dest AS with-chmod-fix
COPY --link --chown=300 --chmod=null --from=src /foo/bar/baz /foo/bar/baz

# When COPY target does not exist, the parent segments will default to `0755 root:root`, original src parents ignored.
# However `--chown` unexpectedly modifies the parent segments ownership to `0755 300:300`, while `--chmod` does not affect parents:
FROM dest AS without-dest-dir
RUN rm -rf /foo
COPY --link --chown=300 --chmod=null --from=src /foo/bar/baz /foo/bar/baz

Optional shell script to run. Builds each stage variant, generating the output shown next:

#!/usr/bin/env bash

function run_example() {
  local -A STAGES
  STAGES['bug']="'--link':\n(unexpectedly modifies parents ownership and permissions)"
  STAGES['with-chown']="'--link --chown=300':\n(same issue but ownership change is to '300' instead of '0')"
  STAGES['with-chmod']="'--link --chown=300 --chmod=700':\n(with '--chmod' flag the parents permissions and ownership are no longer modified, but now enforces modifying source permissions)"
  STAGES['with-chmod-fix']="'--link --chown=300 --chmod=null':\n('--chmod=null' preserves 'src' stage permissions, this is the desired outcome)"
  STAGES['without-dest-dir']="'--link --chown=300 --chmod=null':\n(when parent dirs in 'dest' stage don't exist, '--chmod=null' understandably has no influence)"

  # Iterate via deterministic order of keys:
  local -a STAGE_KEYS=(bug with-chown with-chmod with-chmod-fix without-dest-dir)
  for STAGE in "${STAGE_KEYS[@]}"; do
    local OBSERVATION="${STAGES[${STAGE}]}"
    echo -e "Stage(${STAGE}) ${OBSERVATION}"

    # NOTE: `1>/dev/null` is used to silence irrelevant stdout output
    docker buildx build --load --quiet --target "${STAGE}" --tag bug-copy-link . 1>/dev/null
    docker run --rm --tty bug-copy-link
    docker image rm bug-copy-link 1>/dev/null
    echo -e '\n'
  done
}

run_example

Output from script:

Stage(bug) '--link':
(unexpectedly modifies parents ownership and permissions)
Octal User Group Name
0755  0    0     /foo
0755  0    0     └── bar
0755  0    0        └── baz
0777  100  100         ├── a
0777  100  100         ├── b
0777  100  100         └── c

Stage(with-chown) '--link --chown=300':
(same issue but ownership change is to '300' instead of '0')
Octal User Group Name
0755  300  300   /foo
0755  300  300   └── bar
0755  300  300      └── baz
0777  300  300         ├── a
0777  300  300         ├── b
0777  300  300         └── c

Stage(with-chmod) '--link --chown=300 --chmod=700':
(with '--chmod' flag the parents permissions and ownership are no longer modified, but now enforces modifying source permissions)
Octal User Group Name
0770  200  200   /foo
0770  200  200   └── bar
0770  200  200      └── baz
0700  300  300         ├── a
0700  300  300         ├── b
0700  300  300         └── c

Stage(with-chmod-fix) '--link --chown=300 --chmod=null':
('--chmod=null' preserves 'src' stage permissions, this is the desired outcome)
Octal User Group Name
0770  200  200   /foo
0770  200  200   └── bar
0770  200  200      └── baz
0777  300  300         ├── a
0777  300  300         ├── b
0777  300  300         └── c

Stage(without-dest-dir) '--link --chown=300 --chmod=null':
(when parent dirs in 'dest' stage don't exist, '--chmod=null' understandably has no influence)
Octal User Group Name
0755  300  300   /foo
0755  300  300   └── bar
0755  300  300      └── baz
0777  300  300         ├── a
0777  300  300         ├── b
0777  300  300         └── c

No shell script: inputs preview + bug vs expected results

```console
# The source ownership and permissions:
$ docker buildx build --load --target src --tag bug-copy-link . && docker run --rm -it bug-copy-link
Octal User Group Name
0777  100  100   /foo
0777  100  100   └── bar
0777  100  100      └── baz
0777  100  100         ├── a
0777  100  100         ├── b
0777  100  100         └── c

# The destination ownership and permissions:
$ docker buildx build --load --target dest --tag bug-copy-link . && docker run --rm -it bug-copy-link
Octal User Group Name
0770  200  200   /foo
0770  200  200   └── bar
0770  200  200      └── baz

# `--link --chown=300` (with or without dest dir existing, but `--chmod=null` fix only works when dest dir exists):
$ docker buildx build --load --target without-dest-dir --tag bug-copy-link . && docker run --rm -it bug-copy-link
Octal User Group Name
0755  300  300   /foo
0755  300  300   └── bar
0755  300  300      └── baz
0777  300  300         ├── a
0777  300  300         ├── b
0777  300  300         └── c

# Expected result:
$ docker buildx build --load --target with-chmod-fix --tag bug-copy-link . && docker run --rm -it bug-copy-link
Octal User Group Name
0770  200  200   /foo
0770  200  200   └── bar
0770  200  200      └── baz
0777  300  300         ├── a
0777  300  300         ├── b
0777  300  300         └── c
```
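Eyeballing the tables gets tedious when testing more variants, so the check can be automated. A sketch, assuming the `eza` column layout shown above (octal mode, user, group, name) and the `/foo/bar/baz` tree from the reproduction. `parents_match` is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the eza listing on stdin and succeed only if
# /foo, bar and baz all have the expected mode ($1), uid ($2) and gid ($3).
parents_match() {
  awk -v mode="$1" -v uid="$2" -v gid="$3" '
    { sub(/\r$/, "") }  # tolerate CRLF line endings from `docker run --tty`
    $NF == "/foo" || $NF == "bar" || $NF == "baz" {
      seen++
      # Append "" to force string comparison ("0770" must not equal "770").
      if (($1 "") != (mode "") || ($2 "") != (uid "") || ($3 "") != (gid "")) bad = 1
    }
    END { exit !(seen == 3 && bad == 0) }
  '
}

# Usage (requires Docker and an image built from one of the stages):
#   docker run --rm bug-copy-link | parents_match 0770 200 200 && echo "parents preserved"
```

Running the container without `--tty` keeps the listing free of terminal colour codes, which would otherwise end up inside the fields being compared.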