moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0
8.19k stars 1.16k forks

Permission denied when local cache is a mounted volume #2898

Open Leletir opened 2 years ago

Leletir commented 2 years ago

Hello,

First of all, thank you for your work!

Here is my setup:

Here is the StatefulSet manifest:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    chart: buildkit-0.1.7
    app: buildkit
    version: "0.10.3"
    team: devops
  name: RELEASE-NAME-buildkit
  namespace: test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: buildkit
  template:
    metadata:
      annotations:
        checksum/config: 25efd9c6a2e123c65a13bc0b29babf6bd190e8a761ef9989e6b7b2b291d6b324
        container.apparmor.security.beta.kubernetes.io/buildkit: unconfined
        sidecar.istio.io/inject: "true"
      # see buildkit/docs/rootless.md for caveats of rootless mode
      labels:
        app: buildkit
    spec:
      initContainers:
        - name: init
          image: busybox:1.28
          command: ['sh', '-c', "chown 1000:1000 /tmp/buildkit"]
          volumeMounts:
            - name: RELEASE-NAME-buildkit-local-cache
              mountPath: "/tmp/buildkit"
      containers:
        - name: buildkit
          image: "jfrog.com/docker-hub-remote/moby/buildkit:v0.10.3-rootless"
          imagePullPolicy: IfNotPresent
          env:
          args:
            - --addr
            - "unix:///run/user/1000/buildkit/buildkitd.sock"
            - --addr
            - tcp://0.0.0.0:1234
            - --oci-worker-no-process-sandbox
          readinessProbe:
            exec:
              command:
                - buildctl
                - debug
                - workers
            initialDelaySeconds: 5
            failureThreshold: 5
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          livenessProbe:
            exec:
              command:
                - buildctl
                - debug
                - workers
            initialDelaySeconds: 5
            failureThreshold: 5
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          ports:
            - name: backend
              containerPort: 1234
              protocol: TCP
          resources:
            limits:
              cpu: 3
              memory: 1Gi
            requests:
              cpu: 100m
              memory: 100Mi
          volumeMounts:
            - name: config
              mountPath: "/home/user/.config/buildkit"
            - name: RELEASE-NAME-buildkit-local-cache
              mountPath: "/home/user/.local/share/buildkit"
          securityContext:
            # Needs Kubernetes >= 1.19
            seccompProfile:
              type: Unconfined
            # To change UID/GID, you need to rebuild the image
            runAsUser: 1000
            runAsGroup: 1000
      restartPolicy: Always
      serviceAccount: RELEASE-NAME-buildkit
      securityContext:
        fsGroup: 1000
      volumes:
        - name: config
          configMap:
            defaultMode: 420
            name: RELEASE-NAME-buildkit
  volumeClaimTemplates:
    - metadata:
        name:  RELEASE-NAME-buildkit-local-cache
      spec: 
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "2Gi"
        storageClassName: vastdata-filesystem-reclaimpolicy-retain

The buildkit configuration file:

debug = false

[worker.oci]
  enabled = true
  max-parallelism = 4
  rootless = true

[worker.containerd]
  enabled = false

# Enable insecure repository for the cache only 
[registry."container-image-registry.test.svc.cluster.local:5000"]
  http = true

And the logs with the error:

#1 [internal] load build definition from ./Dockerfile
#1 transferring dockerfile: 90B 0.1s
#1 transferring dockerfile: 2.24kB 0.2s done
#1 DONE 0.2s
#2 [internal] load .dockerignore
#2 transferring context: 152B 0.2s done
#2 DONE 0.2s
#3 [internal] load metadata for artifactory/docker-hub-remote/library/centos@sha256:c73f515d06b0fa07bb18d8202035e739a494ce760aa73129f60f4bf2bd22b407
#3 ...
#4 [auth] docker-hub-remote/library/centos:pull library/centos:pull token for artifactory
#4 DONE 0.0s
#3 [internal] load metadata for artifactory/docker-hub-remote/library/centos@sha256:c73f515d06b0fa07bb18d8202035e739a494ce760aa73129f60f4bf2bd22b407
#3 DONE 1.1s
#5 [internal] load build context
#5 DONE 0.0s
#6 [downloader 1/2] FROM artifactory/docker-hub-remote/library/centos@sha256:c73f515d06b0fa07bb18d8202035e739a494ce760aa73129f60f4bf2bd22b407
#6 resolve artifactory/docker-hub-remote/library/centos@sha256:c73f515d06b0fa07bb18d8202035e739a494ce760aa73129f60f4bf2bd22b407 0.0s done
#6 DONE 0.0s
#7 importing cache manifest from chart-image-registry.test.svc.cluster.local:5000/cache/containers/centos/centos
#7 ERROR: chart-image-registry.test.svc.cluster.local:5000/cache/containers/centos/centos:latest: not found
#6 [downloader 1/2] FROM artifactory/docker-hub-remote/library/centos@sha256:c73f515d06b0fa07bb18d8202035e739a494ce760aa73129f60f4bf2bd22b407
#6 CACHED
#5 [internal] load build context
#5 transferring context: 6.95kB 0.4s done
#5 DONE 0.4s
#8 [stage-1 2/4] COPY res/ /
#8 ERROR: mount callback failed on /run/user/1000/containerd-mount1137022254: open /run/user/1000/containerd-mount1137022254/root/.bash_logout: permission denied
#9 [downloader 2/2] RUN set -euo pipefail  && curl -sSL https://jfrog.com/artifactory/github-remote/krallin/tini/releases/download/v0.19.0/tini-amd64 -o /tmp/tini-amd64  && curl -sSL https://jfrog.com/artifactory/github-remote/krallin/tini/releases/download/v0.19.0/tini-amd64.sha256sum -o /tmp/tini-amd64.sha256sum  && cd /tmp  && echo "$(cat tini-amd64.sha256sum)" | sha256sum -c
#9 ERROR: mount callback failed on /run/user/1000/containerd-mount1137022254: open /run/user/1000/containerd-mount1137022254/root/.bash_logout: permission denied
------
 > importing cache manifest from chart-image-registry.test.svc.cluster.local:5000/cache/containers/centos/centos:
------
------
 > [stage-1 2/4] COPY res/ /:
------
------
 > [downloader 2/2] RUN set -euo pipefail  && curl -sSL https://jfrog.com/artifactory/github-remote/krallin/tini/releases/download/v0.19.0/tini-amd64 -o /tmp/tini-amd64  && curl -sSL https://jfrog.com/artifactory/github-remote/krallin/tini/releases/download/v0.19.0/tini-amd64.sha256sum -o /tmp/tini-amd64.sha256sum  && cd /tmp  && echo "$(cat tini-amd64.sha256sum)" | sha256sum -c:
------
./Dockerfile:27
--------------------
  25 |         TECH_GROUP=g_tech
  26 |     
  27 | >>> COPY res/ /
  28 |     COPY --from=downloader /tmp/tini-amd64 /usr/bin/tini
  29 |     
--------------------
error: failed to solve: failed to compute cache key: mount callback failed on /run/user/1000/containerd-mount1137022254: open /run/user/1000/containerd-mount1137022254/root/.bash_logout: permission denied

The content of the "res/" directory is the following:

├── etc                                                       
│   ├── mail                                                  
│   │   └── relay-domains                                     
│   ├── pki                                                   
│   │   └── tls                                               
│   │       └── certs                                         
│   │           └── ca.crt                        
│   └── skel                                                  
│       └── .bashrc                                           
└── tmp                                                       
    └── yum.repos.d                                           
        ├── centos7.3.repo                                    
        ├── centos7.3_updates.repo                            
        ├── centos7.9.repo                                    
        ├── centos-7.9-updates.repo                           
        ├── centos7-devtools.repo                             
        ├── epel-7-20180222.repo                              
        ├── epel7.repo                                        
        ├── intel-mkl.repo                                    
        ├── nprod-rpm-7-archive.repo                          
        ├── nprod-rpm-7.repo                                  
        ├── postgresql-rpm-7.repo                             
        └── sqpc-centos7.repo

When I don't mount a volume in this directory, everything works fine.

In case you're wondering, '$HOME/.local/share/buildkit' has the following permissions: drwxrwsrwx 2 user user 4096 Jun 6 15:49 buildkit

Do you have any idea what could be wrong in my configuration? Thanks in advance.

weixiao-huang commented 1 year ago

Any update on this issue?

renhao-0518 commented 1 year ago

+1. So BuildKit does not support NFS?

NiklasRosenstein commented 1 year ago

I'm running into a similar issue when running buildkitd directly in rootless mode:

Buildkitd output:

buildkit@buildkit-test:~$ PATH=$PATH:$PWD/bin ./rootlesskit buildkitd --addr tcp://0.0.0.0:1234 --rootless
INFO[2023-08-31T10:41:07Z] auto snapshotter: using overlayfs
INFO[2023-08-31T10:41:07Z] found worker "gy34t5pvh6olao3ltsvw81zfh", labels=map[org.mobyproject.buildkit.worker.executor:oci org.mobyproject.buildkit.worker.hostname:buildkit-test org.mobyproject.buildkit.worker.network:host org.mobyproject.buildkit.worker.oci.process-mode:sandbox org.mobyproject.buildkit.worker.selinux.enabled:false org.mobyproject.buildkit.worker.snapshotter:overlayfs], platforms=[linux/amd64 linux/amd64/v2 linux/amd64/v3 linux/386]
WARN[2023-08-31T10:41:07Z] skipping containerd worker, as "/run/containerd/containerd.sock" does not exist
INFO[2023-08-31T10:41:07Z] found 1 workers, default="gy34t5pvh6olao3ltsvw81zfh"
WARN[2023-08-31T10:41:07Z] currently, only the default worker can be used.
WARN[2023-08-31T10:41:07Z] TLS is not enabled for tcp://0.0.0.0:1234. enabling mutual TLS authentication is highly recommended
INFO[2023-08-31T10:41:07Z] running server on [::]:1234
ERRO[2023-08-31T10:41:12Z] /moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = failed to compute cache key: failed to create temp dir: mkdir /run/user/0/containerd-mount1085290079: permission denied

Docker buildx output:

WARNING: No output specified with remote driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
Dockerfile:2
--------------------
   1 |     FROM ubuntu:latest
   2 | >>> RUN apt-get update && apt-get install sysbench -y && sysbench cpu run
   3 |
--------------------
ERROR: failed to solve: failed to compute cache key: failed to create temp dir: mkdir /run/user/0/containerd-mount1085290079: permission denied

dierbei commented 4 months ago

@NiklasRosenstein @renhao-0518 @weixiao-huang @Leletir I tried using an initContainer to solve this problem.

Here is how:

      initContainers:
        - name: prepare
          image: alpine:3.10
          command:
            - sh
            - -c
            - "chmod 777 /mnt/buildkit"
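
The snippet above does not show how the volume reaches the init container. A fuller sketch of the same idea might look like the following. The `volumeMounts` entry is an assumption: the volume name is borrowed from the StatefulSet in the original post, and `/mnt/buildkit` is assumed to be where that cache volume is mounted in the init container. Whether this resolves the error on a given storage backend is not confirmed in this thread.

```yaml
# Sketch only: the volumeMounts section is an assumption, not part of the comment above.
initContainers:
  - name: prepare
    image: alpine:3.10
    command:
      - sh
      - -c
      # Open up permissions on the cache directory before buildkitd starts
      - "chmod 777 /mnt/buildkit"
    volumeMounts:
      # Assumed: the cache volume claim from the StatefulSet in the original post
      - name: RELEASE-NAME-buildkit-local-cache
        mountPath: /mnt/buildkit
```

The buildkit container would then mount the same volume at "/home/user/.local/share/buildkit", as in the original manifest.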