moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0

Disk usage not correct reported preventing GC trigger #5459

Open sthroner opened 2 weeks ago

sthroner commented 2 weeks ago

Hello,

we are running Buildkit rootless in a Kubernetes installation and have defined a GC policy with keepBytes:

[[worker.oci.gcpolicy]]
    all = true
    keepBytes = "250GB"  # 50GB less than the PVC size for /home/user/.local/share/buildkit

But the rule is not always triggered when we hit the limit. We have already tried to pin down the issue; here are all the details we have found so far.

GC Triggered based on Disk Usage

Most of the time the GC works fine and removes the cached data above the set limit, but from time to time the GC is not triggered, and the BuildKit instance runs out of storage and responds with the following error:

error: failed to solve: ResourceExhausted: failed to prepare k4ovv028ht6dewfcgpus32fn7 as q40z7n0str2xd0ec1u7mjz1r7: copying of parent failed: failed to copy files: write /home/user/.local/share/buildkit/runc-native/snapshots/snapshots/new-19411978/usr/lib/x86_64-linux-gnu/libperl.so.5.36.0: copy_file_range: no space left on device

After some tests it looked like the buildctl disk usage command (buildctl du) did not report the correct amount for the actual disk usage (du). Since the disk usage reported by buildctl was lower than the keepBytes value set in the GC policy, the GC was not triggered.
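To make the mismatch concrete, here is a rough sketch using the figures from this report (the byte conversions are mine, not from the report). The GC compares BuildKit's own reported usage against keepBytes, so the policy never fires even though the volume itself is past the limit. Whether "250GB" parses as decimal GB or as GiB does not change the conclusion: both readings sit above the reported 110.80 GiB and below the actual 291.7 GiB.

```shell
# Figures from this report, converted to bytes. keepBytes = "250GB" is
# treated as decimal gigabytes here (an assumption; a GiB reading would
# give 268.4e9 bytes and lead to the same conclusion).
GiB=$((1024 * 1024 * 1024))
reported=$((11080 * GiB / 100))         # 110.80 GiB, per `buildctl du`
actual=$((2917 * GiB / 10))             # 291.7 GiB, per `du` on the volume
keep_bytes=$((250 * 1000 * 1000 * 1000))

# GC only fires when the *reported* usage exceeds keepBytes.
if [ "$reported" -gt "$keep_bytes" ]; then gc="triggered"; else gc="idle"; fi
echo "reported=$reported actual=$actual keepBytes=$keep_bytes gc=$gc"
```

With 110.80 GiB reported against a 250 GB threshold, the GC stays idle while the real usage (291.7 GiB) has already filled the 300 Gi volume.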

Disk Usage Reported by Buildkit based on type

| Record Type | Size |
| --- | --- |
| source.local | 27.36 MiB |
| regular | 110.78 GiB |
| Total | 110.80 GiB |

Disk Usage System

291.7G  /home/user/.local/share/buildkit/runc-native
291.7G  /home/user/.local/share/buildkit/

When running the GC manually via buildctl prune, the GC does clean up all the space. The garbage collector itself therefore works fine; this looks more like an issue with the measurement of the disk usage.
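As a stop-gap, the manual cleanup described above can be run out of band. This is only a sketch: it assumes buildctl can reach the daemon (via BUILDKIT_HOST or the default socket) and skips cleanly where buildctl is not installed; check the flags against your buildctl version.

```shell
# Out-of-band cleanup, mirroring the manual `buildctl prune` that worked
# in this report. Skips cleanly when buildctl is not on PATH.
if command -v buildctl >/dev/null 2>&1; then
  buildctl du                      # usage as BuildKit itself sees it
  buildctl prune --all --verbose   # remove all cache records
  result="pruned"
else
  result="skipped (buildctl not installed)"
fi
echo "$result"
```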

Wrong Permission within Cache Folder

What we also noticed during the analysis was that the permissions for some folders within the cache were not set as we would expect.

Running du does not work due to permissions:

du: can't open '/home/user/.local/share/buildkit/runc-native/snapshots/snapshots/1762/var/cache/apt/archives/partial': Permission denied

Permissions for the folder:

~/.local/share/buildkit/runc-native/snapshots/snapshots/1762/var/cache/apt/archives $ ls -la
total 12
drwxr-xr-x    3 user     user          4096 Oct  8 13:21 .
drwxr-xr-x    3 user     user          4096 Oct  8 13:21 ..
-rw-r-----    1 user     user             0 Aug 13 00:43 lock
drwx------    2 100041   user          4096 Aug 13 00:43 partial

We also see other folders with permissions similar to var/cache/apt/archives/partial, so this does not seem to be related only to the apt package manager.
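The 100041 owner shown above is consistent with rootless subuid remapping rather than a corrupted cache. A sketch of the arithmetic, assuming the common default subuid range 100000:65536 (both the range start and the mapping offset are assumptions here; verify against /etc/subuid and /proc/&lt;buildkitd-pid&gt;/uid_map in the image):

```shell
# In a rootless user namespace, in-container uid 0 maps to the unprivileged
# user, and in-container uids 1..65536 map onto the subuid range. The range
# start below is an assumed default, not taken from this report.
subuid_start=100000
host_uid=100041                            # owner shown by `ls -la` above
inner_uid=$((host_uid - subuid_start + 1))
echo "host uid $host_uid is uid $inner_uid inside the build container"
```

A directory owned by such a remapped uid with mode drwx------ is exactly what plain du, running as user, cannot descend into.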

Setup

We currently use version v0.16.0 of the rootless container image (https://hub.docker.com/layers/moby/buildkit/v0.16.0-rootless) in a K8s setup.

StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: buildkit-temp
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: buildkit-temp
  serviceName: buildkit-temp
  template:
    metadata:
      labels:
        app.kubernetes.io/name: buildkit-temp
    spec:
      containers:
        - args:
            - --config
            - /var/config/buildkit.toml
          image: moby/buildkit:rootless
          name: buildkit
          ports:
            - containerPort: 1234
              protocol: TCP
          securityContext:
            allowPrivilegeEscalation: true
            capabilities:
              add:
              - CHOWN
              - DAC_OVERRIDE
              - FOWNER
              - FSETID
              - SETGID
              - SETUID
              - SETFCAP
              drop:
              - ALL
            privileged: false
            runAsGroup: 1000
            runAsNonRoot: true
            runAsUser: 1000
            seccompProfile:
              type: Unconfined
          volumeMounts:
            - mountPath: /home/user/.local/share/buildkit
              name: buildkit
            - mountPath: /var/config
              name: config
      securityContext:
        fsGroup: 1000
      ## Include the Service Account in the deployment
      serviceAccount: gitlab-runner-master-buildkit
      serviceAccountName: gitlab-runner-master-buildkit
      volumes:
        - name: config
          configMap:
            name: buildkit-temp
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
  volumeClaimTemplates:
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: buildkit
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 300Gi
        volumeMode: Filesystem

ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: buildkit-temp
data:
  buildkit.toml: |
    debug = true

    [grpc]
      address = [
        "tcp://0.0.0.0:1234", # tcp is for buildctl connections
        "unix:///run/user/1000/buildkit/buildkitd.sock", # non-root socket when running as non-root
      ]

    [worker.containerd]
      enabled = false

    [worker.oci]
      enabled = true
      # Enable automatic garbage collection, runs every minute
      gc = true
      # Allow running in main pid namespace when privileged: false
      noProcessSandbox = true

      [[worker.oci.gcpolicy]]
        all = true
        keepBytes = "250GB"  # 50GB less than the PVC size for /home/user/.local/share/buildkit
devthejo commented 5 days ago

I seem to have a similar problem here. This is my full config (in case I'm missing something):

debug = true
root = "/var/lib/buildkit"

[history]
  maxAge = 172800
  maxEntries = 50

[worker]

[worker.containerd]
  enabled = false

[worker.oci]
  enabled = true
  rootless = true
  max-parallelism = 4
  gc = true
  snapshotter = "auto"
  platforms = ["linux/amd64"]

  [[worker.oci.gcpolicy]]
    filters = ["type==exec.cachemount"]
    keepBytes = "90%"
    keepDuration = "30d"

  [[worker.oci.gcpolicy]]
    all = true
    keepBytes = "90%"
    keepDuration = "30d"

But my volume, mounted on /home/user/.local/share/buildkit and used only by buildkit, is 96% full, causing a "no space left on device" error when trying to run a build task.

EDIT: another observation: after restarting the pod (and resizing the volume), it seems the cleanup was performed.
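For the percentage policies in the config above: my reading is that a percentage keepBytes resolves against the total size of the filesystem holding the BuildKit root (worth verifying against the docs for your BuildKit version). A sketch of what "90%" works out to, with a hypothetical 300 GiB fallback for when the path does not exist locally:

```shell
# Resolve keepBytes = "90%" against the filesystem holding the BuildKit
# root. The path and the fallback size are assumptions for illustration.
ROOT="${BUILDKIT_ROOT:-/var/lib/buildkit}"
total_kb=$(df -Pk "$ROOT" 2>/dev/null | awk 'NR==2 {print $2}')
total_kb=${total_kb:-314572800}    # fallback: 300 GiB sample volume, in KiB
keep_kb=$((total_kb * 90 / 100))
echo "keepBytes(90%) = ${keep_kb} KiB of ${total_kb} KiB total"
```

On a 300 Gi PVC that leaves only ~30 GiB of headroom before the volume fills, which may be tighter than intended if builds write large snapshots between GC runs.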

jedevc commented 4 days ago

@devthejo what version of buildkit do you see this on? The original issue seems to be on v0.16, with rootless mode, is that the same setup you have?

devthejo commented 4 days ago

> @devthejo what version of buildkit do you see this on? The original issue seems to be on v0.16, with rootless mode, is that the same setup you have?

@jedevc It was v0.13.0. I have now upgraded to v0.17.1 and am waiting to see if it's reproducible on the new version (I needed to fix the bug quickly and didn't have enough time to investigate further). Not sure it's the same issue, but it looked similar.