concourse / concourse

Concourse is a container-based continuous thing-doer written in Go.
https://concourse-ci.org
Apache License 2.0

Investigate containerd user home directory has the wrong owner and group IDs #8226

Open clarafu opened 2 years ago

clarafu commented 2 years ago

Summary

A user reported that a container image used to run integration tests ended up with a user home directory that had the wrong owner and group IDs. The container image resource is pinned, so nothing has changed in the image itself.

More information can be found in the Slack thread in the hush-house channel, but it seems to have started after they upgraded their workers to the new 7.7 version.

Triaging info

cfryanr commented 2 years ago

Hi all, thanks for creating this issue for us.

I’ve created a simple pipeline which reproduces the issue. Here it is:

jobs:
  - name: build-and-run
    plan:
      - task: dockerfile
        tags: [ my-workers ]
        config:
          platform: linux
          image_resource:
            type: registry-image
            source:
              repository: debian
              tag: 11-slim
          outputs:
            - name: dockerfile
          run:
            path: bash
            args:
              - -ceux
              - |
                cat << EOF > dockerfile/Dockerfile
                FROM debian:11-slim
                RUN useradd --create-home testuser1 # create new UID, GID, and home dir
                RUN useradd --create-home testuser2 # create new UID, GID, and home dir
                RUN cat /etc/passwd | grep testuser # show new UIDs
                RUN cat /etc/group | grep testuser # show new GIDs
                RUN ls -lan /home/testuser1 # show ownership of new home dir
                RUN ls -lan /home/testuser2 # show ownership of new home dir
                EOF
      - task: build
        tags: [ my-workers ]
        privileged: true
        config:
          platform: linux
          image_resource:
            type: registry-image
            source:
              repository: concourse/oci-build-task
          inputs:
            - name: dockerfile
          outputs:
            - name: image
          run:
            path: build
          caches:
            - path: cache
          params:
            CONTEXT: dockerfile
            UNPACK_ROOTFS: true
      - task: run
        tags: [ my-workers ]
        image: image
        config:
          platform: linux
          run:
            path: bash
            args:
              - -ceux
              - |
                cat /etc/passwd | grep testuser # show UIDs
                cat /etc/group | grep testuser # show GIDs
                ls -lan /home/testuser1 # show file ownership
                ls -lan /home/testuser2 # show file ownership
                # Fail if the directory ownership does not match expected UID/GID
                if [[ "$(id testuser1 -u)" != "$(stat -c '%u' /home/testuser1)" ]]; then exit 1; fi
                if [[ "$(id testuser2 -u)" != "$(stat -c '%u' /home/testuser2)" ]]; then exit 1; fi
                if [[ "$(id testuser1 -g)" != "$(stat -c '%g' /home/testuser1)" ]]; then exit 1; fi
                if [[ "$(id testuser2 -g)" != "$(stat -c '%g' /home/testuser2)" ]]; then exit 1; fi

Note that these tasks use my team's external workers (via tags).
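For debugging outside Concourse, the final task's checks can be factored into a standalone script. This is a sketch; `check_ownership` is a hypothetical helper, not part of the pipeline:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Fail (return 1) if a directory's numeric owner/group do not match the
# expected UID/GID; mirrors the stat/id comparison in the pipeline's run task.
check_ownership() {
  local dir=$1 expected_uid=$2 expected_gid=$3
  local actual_uid actual_gid
  actual_uid=$(stat -c '%u' "$dir")
  actual_gid=$(stat -c '%g' "$dir")
  if [ "$actual_uid" != "$expected_uid" ] || [ "$actual_gid" != "$expected_gid" ]; then
    echo "MISMATCH: $dir is ${actual_uid}:${actual_gid}, expected ${expected_uid}:${expected_gid}"
    return 1
  fi
  echo "OK: $dir is ${expected_uid}:${expected_gid}"
}

# Example: check both test users' home directories, if those users exist.
for user in testuser1 testuser2; do
  if id "$user" >/dev/null 2>&1; then
    check_ownership "/home/$user" "$(id -u "$user")" "$(id -g "$user")"
  fi
done
```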

Example output from a failed run of this job:

selected worker: my-workers-us-west1-b-28ba72c6
streaming volume for image from my-workers-us-west1-b-06a6d6b6
+ cat /etc/passwd
+ grep testuser
testuser1:x:1000:1000::/home/testuser1:/bin/sh
testuser2:x:1001:1001::/home/testuser2:/bin/sh
+ cat /etc/group
+ grep testuser
testuser1:x:1000:
testuser2:x:1001:
+ ls -lan /home/testuser1
total 12
drwxr-xr-x 1 1001 1000   54 Mar 31 19:02 .
drwxr-xr-x 1    0    0   36 Mar 31 19:02 ..
-rw-r--r-- 1 1001 1000  220 Aug  4  2021 .bash_logout
-rw-r--r-- 1 1001 1000 3526 Aug  4  2021 .bashrc
-rw-r--r-- 1 1001 1000  807 Aug  4  2021 .profile
+ ls -lan /home/testuser2
total 12
drwxr-xr-x 1 1000 1002   54 Mar 31 19:02 .
drwxr-xr-x 1    0    0   36 Mar 31 19:02 ..
-rw-r--r-- 1 1000 1002  220 Aug  4  2021 .bash_logout
-rw-r--r-- 1 1000 1002 3526 Aug  4  2021 .bashrc
-rw-r--r-- 1 1000 1002  807 Aug  4  2021 .profile
++ id testuser1 -u
++ stat -c %u /home/testuser1
+ [[ 1000 != \1\0\0\1 ]]
+ exit 1

Our workers used CONCOURSE_WORK_DIR=/mnt/disks/local-ssd, where that mount point was a local SSD formatted with mkfs.btrfs, along with CONCOURSE_BAGGAGECLAIM_DRIVER=btrfs.

When I changed our workers to instead format that same drive using mkfs -t ext4, along with CONCOURSE_BAGGAGECLAIM_DRIVER=overlay, then the above job always passes, even when different workers are selected for the build and run tasks.

My suspicion is that Concourse v7.7.0 (or one of its dependencies) introduced some kind of bug related to streaming volumes between workers using btrfs.

Note that I did not check to see if the version of btrfs that gets installed onto our workers changed recently. We were installing btrfs tools onto the workers using apt-get update --allow-releaseinfo-change && apt-get install btrfs-progs -y. The workers are GCP VMs with debian-10-buster-v20210316 boot disks. The worker VMs get deleted and recreated on a regular basis, almost nightly.
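Since a silent btrfs-progs bump on the nightly-recreated workers is a plausible variable, logging the userspace and kernel versions at provisioning time would make that easy to rule out. A minimal sketch (the function name is hypothetical):

```shell
#!/usr/bin/env bash
# Log the btrfs userspace and kernel versions during worker provisioning so a
# behavior change can later be correlated with a package or boot-image bump.
record_btrfs_versions() {
  # dpkg-query works on the Debian worker images; fall back gracefully elsewhere.
  dpkg-query -W -f='btrfs-progs ${Version}\n' btrfs-progs 2>/dev/null \
    || echo "btrfs-progs: not installed"
  echo "kernel: $(uname -r)"
}
record_btrfs_versions
```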

xtremerui commented 2 years ago

Tried to reproduce this in a local docker compose env with

CONCOURSE_BAGGAGECLAIM_DRIVER: btrfs
CONCOURSE_CONTAINER_PLACEMENT_STRATEGY: fewest-build-containers  # force volume streaming between workers

and two workers. Even though the image was built on worker 1 and the run task executed on worker 2, the job passed.
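For reference, a sketch of the compose overrides used for this reproduction, assuming the standard Concourse docker-compose.yml; the service names are assumptions, and note the placement strategy is set on the web node, not the workers:

```yaml
# Hypothetical overrides for a two-worker local reproduction.
services:
  web:
    environment:
      CONCOURSE_CONTAINER_PLACEMENT_STRATEGY: fewest-build-containers  # force cross-worker streaming
  worker-1:
    privileged: true
    environment:
      CONCOURSE_BAGGAGECLAIM_DRIVER: btrfs
  worker-2:
    privileged: true
    environment:
      CONCOURSE_BAGGAGECLAIM_DRIVER: btrfs
```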

Thus I wonder if it is related to the worker base image used in the original issue. Could you provide more details about that? For example, the gcloud image used to build the external worker VMs.

cfryanr commented 2 years ago

Hi @xtremerui, thanks for looking at this.

Our workers were GCP c2-standard-8 VMs created with a debian-10-buster-v20210316 boot disk (a boot disk image offered by Google). We also added a Local SSD scratch disk at VM creation time, and after booting the VM we formatted the scratch disk as btrfs using:

apt-get update --allow-releaseinfo-change && apt-get install btrfs-progs -y
mkfs.btrfs /dev/nvme0n1
mkdir -p /mnt/disks/local-ssd
mount /dev/nvme0n1 /mnt/disks/local-ssd

Then we used it as the CONCOURSE_WORK_DIR=/mnt/disks/local-ssd with CONCOURSE_BAGGAGECLAIM_DRIVER=btrfs.

To work around the issue, we changed this to instead be:

mkfs -t ext4 /dev/nvme0n1
mkdir -p /mnt/disks/local-ssd
mount /dev/nvme0n1 /mnt/disks/local-ssd

with CONCOURSE_WORK_DIR=/mnt/disks/local-ssd and CONCOURSE_BAGGAGECLAIM_DRIVER=overlay, which made the problem go away.