docker / buildx

Docker CLI plugin for extended build capabilities with BuildKit
Apache License 2.0

buildx tries to copy xattrs (extended file attributes) and fails #584

Closed LuckyTurtleDev closed 2 years ago

LuckyTurtleDev commented 3 years ago

I use buildx to build a multi-platform Docker image in GitLab CI. The CI job fails while building the image, because buildx tries to copy xattrs and cannot set them:

 > [linux/arm/v7 2/4] RUN set -xe     && apk add --no-cache ca-certificates                           ffmpeg                           openssl                           aria2     && pip3 install youtube-dl:
------
Dockerfile:8
--------------------
   7 |     
   8 | >>> RUN set -xe \
   9 | >>>     && apk add --no-cache ca-certificates \
  10 | >>>                           ffmpeg \
  11 | >>>                           openssl \
  12 | >>>                           aria2 \
  13 | >>>     && pip3 install youtube-dl
  14 |     
--------------------
error: failed to solve: rpc error: code = Unknown desc = executor failed running [/dev/.buildkit_qemu_emulator /bin/sh -c set -xe     && apk add --no-cache ca-certificates                           ffmpeg                           openssl                           aria2     && pip3 install youtube-dl]: failed to copy xattrs: failed to set xattr "security.selinux" on /tmp/buildkit-qemu-emulator538849571/dev/.buildkit_qemu_emulator: operation not supported

https://gitlab.com/Lukas1818/docker-youtube-dl-cron/-/jobs/1160602917

I am using the following CI configuration:

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_HOST: tcp://docker:2375/

docker-build:
  # Use the docker image with buildx for multiplatform build.
  image: lukas1818/docker-with-buildx:latest
  stage: build
  services:
    - docker:dind
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  # Default branch leaves tag empty (= latest tag)
  # All other branches are tagged with the escaped branch name (commit ref slug)
  script:
    - |
      if [[ "$CI_COMMIT_BRANCH" == "$CI_DEFAULT_BRANCH" ]]; then
        tag=""
        echo "Running on default branch '$CI_DEFAULT_BRANCH': tag = 'latest'"
      else
        tag=":$CI_COMMIT_REF_SLUG"
        echo "Running on branch '$CI_COMMIT_BRANCH': tag = $tag"
      fi
    - docker buildx create --use
    - docker buildx build --push --platform linux/arm/v7,linux/arm64/v8,linux/amd64 --tag "$CI_REGISTRY_IMAGE${tag}" .
  # Run this job in a branch where a Dockerfile exists
  rules:
    - if: $CI_COMMIT_BRANCH
      exists:
        - Dockerfile

https://gitlab.com/Lukas1818/docker-youtube-dl-cron/-/blob/a0875f176c79841ac702d10db018cc803b615743/.gitlab-ci.yml

This also happens in another repo:

> [linux/arm64 2/3] RUN apt-get update && apt-get install -y    ca-certificates     stubby && rm -rf /var/lib/apt/lists/*:
------
Dockerfile:4
--------------------
   3 |     
   4 | >>> RUN apt-get update && apt-get install -y \
   5 | >>>  ca-certificates \
   6 | >>>  stubby\
   7 | >>>  && rm -rf /var/lib/apt/lists/*
   8 |     
--------------------
error: failed to solve: rpc error: code = Unknown desc = executor failed running [/dev/.buildkit_qemu_emulator /bin/sh -c apt-get update && apt-get install -y  ca-certificates     stubby && rm -rf /var/lib/apt/lists/*]: failed to copy xattrs: failed to set xattr "security.selinux" on /tmp/buildkit-qemu-emulator140786208/dev/.buildkit_qemu_emulator: operation not supported

https://gitlab.com/Lukas1818/docker-stubby/-/jobs/1157747049

Is there an option to ignore xattrs?

floscher commented 3 years ago

At https://gitlab.com/gokaart/docker-library/-/jobs/1199028080#L349 we hit the same error:

------
 > [linux/arm64 2/3] RUN mkdir ~/.gradle/ && echo "org.gradle.daemon=false" >> ~/.gradle/gradle.properties:
------
Dockerfile:4
--------------------
   2 |     MAINTAINER floscher
   3 |     
   4 | >>> RUN mkdir ~/.gradle/ && echo "org.gradle.daemon=false" >> ~/.gradle/gradle.properties
   5 |     
   6 |     RUN apt-get update -yq && apt-get install -yq git openssh-client
--------------------
error: failed to solve: rpc error: code = Unknown desc = executor failed running [/dev/.buildkit_qemu_emulator /bin/sh -c mkdir ~/.gradle/ && echo "org.gradle.daemon=false" >> ~/.gradle/gradle.properties]: failed to copy xattrs: failed to set xattr "security.selinux" on /tmp/buildkit-qemu-emulator616775368/dev/.buildkit_qemu_emulator: operation not supported

ChriZ982 commented 3 years ago

I was able to fix this issue by calling docker run --rm --privileged multiarch/qemu-user-static --reset -p yes before docker buildx create --use.

However, sometimes the job fails for me and I have to restart it. Might have something to do with two concurrent jobs. Didn't figure it out yet...

Solution found here.
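The workaround above registers QEMU user-mode emulators with the kernel's binfmt_misc. A minimal sketch for checking that the registration is visible before creating the builder follows. It assumes binfmt_misc is mounted at /proc/sys/fs/binfmt_misc in the context where the check runs and that entries are named qemu-<arch> (for example qemu-aarch64). The function names and the BINFMT_DIR override are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a QEMU binfmt_misc entry exists for the
# given architecture. BINFMT_DIR is an assumed override for the mount point.
qemu_handler_registered() {
  arch="$1"
  dir="${BINFMT_DIR:-/proc/sys/fs/binfmt_misc}"
  [ -e "$dir/qemu-$arch" ]
}

# Print the architectures (space separated) that have no handler registered.
missing_qemu_handlers() {
  missing=""
  for arch in "$@"; do
    if ! qemu_handler_registered "$arch"; then
      missing="$missing $arch"
    fi
  done
  # Trim the leading space before printing.
  echo "${missing# }"
}
```

A CI script could call missing_qemu_handlers aarch64 arm after the docker run ... --reset -p yes step and stop early if it prints anything, which may help tell a missing registration apart from other failures.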

LuckyTurtleDev commented 3 years ago

> I was able to fix this issue by calling docker run --rm --privileged multiarch/qemu-user-static --reset -p yes before docker buildx create --use.
>
> However, sometimes the job fails for me and I have to restart it. Might have something to do with two concurrent jobs. Didn't figure it out yet...
>
> Solution found here.

If I run docker run --rm --privileged multiarch/qemu-user-static --reset -p yes in the same script step as docker buildx create --use, it works.

Thanks

This works:

  script:
    - |
      if [[ "$CI_COMMIT_BRANCH" == "$CI_DEFAULT_BRANCH" ]]; then
        tag=""
        echo "Running on default branch '$CI_DEFAULT_BRANCH': tag = 'latest'"
      else
        tag=":$CI_COMMIT_REF_SLUG"
        echo "Running on branch '$CI_COMMIT_BRANCH': tag = $tag"
      fi
    - docker run --rm --privileged multiarch/qemu-user-static --reset -p yes; docker buildx create --use
    - docker buildx build --push --platform linux/arm/v7,linux/arm64/v8,linux/amd64 --tag "$CI_REGISTRY_IMAGE${tag}" .

This does not work:

  script:
    - |
      if [[ "$CI_COMMIT_BRANCH" == "$CI_DEFAULT_BRANCH" ]]; then
        tag=""
        echo "Running on default branch '$CI_DEFAULT_BRANCH': tag = 'latest'"
      else
        tag=":$CI_COMMIT_REF_SLUG"
        echo "Running on branch '$CI_COMMIT_BRANCH': tag = $tag"
      fi
    - docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
    - docker buildx create --use
    - docker buildx build --push --platform linux/arm/v7,linux/arm64/v8,linux/amd64 --tag "$CI_REGISTRY_IMAGE${tag}" .

Because the build still fails sometimes, I would recommend adding retry: 2 to your .gitlab-ci.yml. Edit: The builds have not failed because of this for a long time, so the retry may no longer be needed.
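The retry: 2 setting reruns the whole job. As an alternative technique, not the one used above, the flaky command alone can be retried inside the script. This is a minimal sketch; the retry function and the RETRY_DELAY variable are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping RETRY_DELAY
# seconds (default 5) between attempts. Returns 0 on the first success.
retry() {
  attempts="$1"
  shift
  n=1
  while :; do
    "$@" && return 0
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed, retrying: $*" >&2
    n=$((n + 1))
    sleep "${RETRY_DELAY:-5}"
  done
}
```

In a script step this could wrap only the build, for example retry 3 docker buildx build --push ..., so that the login and setup steps are not repeated.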

mickaelperrin commented 3 years ago

Hi,

This tip helped me solve the issue while building my multi-arch Nginx image. However, I still have the issue when trying to build my multi-step, multi-arch PHP image.

The randomness of the issue prevents my build from completing, because EVERY TIME it makes one of the PHP versions I build concurrently fail.

Excerpt of my .gitlab-ci.yml file:

image: registry.gitlab.com/dkod-docker/docker-in-docker:latest
services:
  - docker:dind

stages:
  - base
  - release_base
  - dev
  - release_dev
  - wordpress
  - release_wordpress
  - drupal
  - release_drupal

variables:
  DOCKER_HOST: tcp://docker:2375
  DOCKER_DRIVER: overlay2
  CONTAINER_TEST_IMAGE: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
  CONTAINER_RELEASE_IMAGE: $CI_REGISTRY_IMAGE:latest

before_script:
  - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
  - export DOCKER_CLI_EXPERIMENTAL=enabled
  - docker version
  - docker buildx version
  - uname -a
  - docker info
  - ls -al /usr/lib/docker/cli-plugins
  - docker buildx version
  - docker buildx create --help

build_php_production_56_fpm:
  stage: base
  script:
    - ./build.sh php production 5.6 fpm

build_php_production_70_fpm:
  stage: base
  script:
    - ./build.sh php production 7.0 fpm

build_php_production_71_fpm:
  stage: base
  script:
    - ./build.sh php production 7.1 fpm

build_php_production_72_fpm:
  stage: base
  script:
    - ./build.sh php production 7.2 fpm

build_php_production_73_fpm:
  stage: base
  script:
    - ./build.sh php production 7.3 fpm

build_php_production_74_fpm:
  stage: base
  script:
    - ./build.sh php production 7.4 fpm

build_php_production_80_fpm:
  stage: base
  script:
    - ./build.sh php production 8.0 fpm

release_php_production:
  stage: release_base
  script:
    - ACTION=release ./build.sh php production
  only:
    - master

build_php_development_56_fpm:
  stage: dev
  script:
    - ./build.sh php development 5.6 fpm

build_php_development_70_fpm:
  stage: dev
  script:
    - ./build.sh php development 7.0 fpm

Excerpt of my build script:

  export DOCKER_CLI_EXPERIMENTAL=enabled
  docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
  docker buildx create --use
  docker pull --platform "${platform}" ${REGISTRY}/${IMAGE}:${techno}-${env}-${version}-${typefinal} || true
  docker buildx build \
    $QUIET \
    --platform linux/arm64,linux/amd64 \
    --push \
    --cache-from ${REGISTRY}/${IMAGE}:${techno}-${env}-${version}-${typefinal} \
    --build-arg REGISTRY=${REGISTRY} \
    --build-arg BASE_IMAGE=${IMAGE} \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    -t ${REGISTRY}/${IMAGE}:${techno}-${env}-${version}-${typefinal}${CI_COMMIT_SHORT}

Any ideas?

ChriZ982 commented 3 years ago

@mickaelperrin maybe you could try splitting the build jobs into different stages, so that you only have one job per stage. Of course, that is not a feasible solution, but it might help in debugging the issue. I encounter this issue randomly with two jobs in a stage.

mickaelperrin commented 3 years ago

As you may have spotted, my build is pretty big: 7 versions of PHP x 8 stages x 2 platforms.

The error occurs randomly: sometimes when building the 5.6 version, sometimes for 7.0, and so on.

It fails at the first RUN command of the Dockerfile, which is a simple apt install.

error: failed to solve: rpc error: code = Unknown desc = executor failed running [/dev/.buildkit_qemu_emulator /bin/sh -c apt-get update && apt-get install --no-install-recommends -y locales && sed -i -e "s/# ${LANG} UTF-8/${LANG} UTF-8/" /etc/locale.gen && echo "LANG=\"${LANG}\"" > /etc/default/locale && dpkg-reconfigure -f noninteractive locales && /usr/sbin/update-locale LANG=${LANG} && apt-get clean && rm -r /var/lib/apt/lists/*]: failed to copy xattrs: failed to set xattr "security.selinux" on /tmp/buildkit-qemu-emulator447401266/dev/.buildkit_qemu_emulator: operation not supported

But I also have another issue with that pipeline. QEMU ARM emulation is so slow that the build takes more than 1 hour, so I need to find another way to do a multi-arch build. I think this could be done by building the images directly on hosts of each architecture and merging them in the repository.
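The idea of building on native hosts and merging afterwards can be sketched with docker buildx imagetools create, which assembles a multi-arch tag from images that already exist in a registry. The sketch below only prints the command. The per-architecture tag scheme ("<image>-<arch>"), the function name, and the example registry are assumptions, not something taken from this pipeline.

```shell
#!/bin/sh
# Hypothetical helper: print the command that would combine per-architecture
# images, assumed to be tagged "<image>-<arch>", into one multi-arch tag.
# Printing instead of executing keeps the sketch free of side effects.
manifest_merge_cmd() {
  image="$1"
  shift
  cmd="docker buildx imagetools create -t $image"
  for arch in "$@"; do
    cmd="$cmd $image-$arch"
  done
  echo "$cmd"
}

# Example with an assumed image name:
manifest_merge_cmd registry.example.com/php:8.0 amd64 arm64
```

Under these assumptions, each native runner would push its own "<image>-<arch>" tag, and a final job would run the printed command once all of them are available.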

crazy-max commented 2 years ago

@Lukas1818 we have added some e2e tests to the binfmt repo (https://github.com/tonistiigi/binfmt/pull/81) using your repro, and it looks OK with the latest release: https://github.com/tonistiigi/binfmt/runs/5235540518?check_suite_focus=true#step:5:121