k8up-io / k8up

Kubernetes and OpenShift Backup Operator
https://k8up.io/
Apache License 2.0

Longhorn & Wordpress - k8up changes directory permission #912

Open nupplaphil opened 10 months ago

nupplaphil commented 10 months ago

Description

I'm using Longhorn with WordPress (the Bitnami image) on my k8s cluster and want to automatically back up the PVC with k8up.

After deploying WordPress, everything works as expected: I can access /bitnami/wordpress (the Longhorn mountpoint) inside the pod and the blog shows up.

Here's an ls -lh inside the pod before starting the backup:

$ kubectl exec -n blog -c wordpress blog-wordpress-f8fd65b55-7762s -- ls -lh /bitnami
total 4.0K
drwxrwsr-x. 3 1001 1001 4.0K Nov 30 11:29 wordpress
$ kubectl exec -n blog -c wordpress blog-wordpress-f8fd65b55-7762s -- ls -lh /bitnami/wordpress
total 12K
-rw-rw----. 1 1001 1001 4.3K Nov 30 11:29 wp-config.php
drwxrwsr-x. 7 1001 1001 4.0K Nov 30 11:31 wp-content

After starting a Backup job that includes the PVC, the blog becomes unavailable. /bitnami/wordpress shows the same ownership as before, but I can no longer access it:

$ kubectl exec -n blog -c wordpress blog-wordpress-f8fd65b55-7762s -- ls -lh /bitnami
total 4.0K
drwxrwsr-x. 3 1001 1001 4.0K Nov 30 11:29 wordpress
$ kubectl exec -n blog -c wordpress blog-wordpress-f8fd65b55-7762s -- ls -lh /bitnami/wordpress
ls: cannot open directory '/bitnami/wordpress': Permission denied
command terminated with exit code 2

It seems like the backup somehow changes the permissions on the mountpoint, even though I can't see any change in the ownership. I don't know how ...
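
One way to compare the raw permission bits (and, in case SELinux is involved, the file context) before and after the backup, assuming the container image ships GNU coreutils, would be:

$ kubectl exec -n blog -c wordpress blog-wordpress-f8fd65b55-7762s -- stat -c '%A %a %u:%g %n' /bitnami/wordpress
$ kubectl exec -n blog -c wordpress blog-wordpress-f8fd65b55-7762s -- ls -ldZ /bitnami/wordpress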

Additional Context

No response

Logs

2023-11-30T21:36:19Z    INFO    k8up    Starting k8up…  {"version": "2.7.2", "date": "2023-10-09T10:13:29Z", "commit": "45d99dd90dbb2a080e6832c34e96b371216a3e0b", "go_os": "linux", "go_arch": "amd64", "go_version": "go1.19.13", "uid": 1001, "gid": 0}
2023-11-30T21:36:19Z    INFO    k8up.restic initializing
2023-11-30T21:36:19Z    INFO    k8up.restic setting up a signal handler
2023-11-30T21:36:19Z    INFO    k8up.restic.restic  using the following restic options  {"options": [""]}
2023-11-30T21:36:19Z    INFO    k8up.restic.restic.RepoInit.command restic command  {"path": "/usr/local/bin/restic", "args": ["init", "--option", ""]}
2023-11-30T21:36:19Z    INFO    k8up.restic.restic.RepoInit.command Defining RESTIC_PROGRESS_FPS    {"frequency": 0.016666666666666666}
2023-11-30T21:36:19Z    INFO    k8up.restic.restic.unlock   unlocking repository    {"all": false}
2023-11-30T21:36:19Z    INFO    k8up.restic.restic.unlock.command   restic command  {"path": "/usr/local/bin/restic", "args": ["unlock", "--option", ""]}
2023-11-30T21:36:19Z    INFO    k8up.restic.restic.unlock.command   Defining RESTIC_PROGRESS_FPS    {"frequency": 0.016666666666666666}
2023-11-30T21:36:20Z    INFO    k8up.restic.restic.snapshots    getting list of snapshots
2023-11-30T21:36:20Z    INFO    k8up.restic.restic.snapshots.command    restic command  {"path": "/usr/local/bin/restic", "args": ["snapshots", "--option", "", "--json"]}
2023-11-30T21:36:20Z    INFO    k8up.restic.restic.snapshots.command    Defining RESTIC_PROGRESS_FPS    {"frequency": 0.016666666666666666}
2023-11-30T21:36:21Z    INFO    k8up.restic.k8sClient   listing all pods    {"annotation": "k8up.io/backupcommand", "namespace": "blog"}
2023-11-30T21:36:21Z    INFO    k8up.restic backups of annotated jobs have finished successfully
2023-11-30T21:36:21Z    INFO    k8up.restic.restic.backup   starting backup
2023-11-30T21:36:21Z    INFO    k8up.restic.restic.backup   starting backup for folder  {"foldername": "blog-wordpress"}
2023-11-30T21:36:21Z    INFO    k8up.restic.restic.backup.command   restic command  {"path": "/usr/local/bin/restic", "args": ["backup", "--option", "", "--host", "blog", "--json", "/data/blog-wordpress"]}
2023-11-30T21:36:21Z    INFO    k8up.restic.restic.backup.command   Defining RESTIC_PROGRESS_FPS    {"frequency": 0.016666666666666666}
2023-11-30T21:36:22Z    ERROR   k8up.restic.restic.backup.progress  /data/blog-wordpress/lost+found during scan     {"error": "error occurred during backup"}
github.com/k8up-io/k8up/v2/restic/logging.(*BackupOutputParser).out
    /home/runner/work/k8up/k8up/restic/logging/logging.go:156
github.com/k8up-io/k8up/v2/restic/logging.writer.Write
    /home/runner/work/k8up/k8up/restic/logging/logging.go:103
io.copyBuffer
    /opt/hostedtoolcache/go/1.19.13/x64/src/io/io.go:429
io.Copy
    /opt/hostedtoolcache/go/1.19.13/x64/src/io/io.go:386
os/exec.(*Cmd).writerDescriptor.func1
    /opt/hostedtoolcache/go/1.19.13/x64/src/os/exec/exec.go:407
os/exec.(*Cmd).Start.func1
    /opt/hostedtoolcache/go/1.19.13/x64/src/os/exec/exec.go:544
2023-11-30T21:36:22Z    ERROR   k8up.restic.restic.backup.progress  /data/blog-wordpress/lost+found during archival     {"error": "error occurred during backup"}
github.com/k8up-io/k8up/v2/restic/logging.(*BackupOutputParser).out
    /home/runner/work/k8up/k8up/restic/logging/logging.go:156
github.com/k8up-io/k8up/v2/restic/logging.writer.Write
    /home/runner/work/k8up/k8up/restic/logging/logging.go:103
io.copyBuffer
    /opt/hostedtoolcache/go/1.19.13/x64/src/io/io.go:429
io.Copy
    /opt/hostedtoolcache/go/1.19.13/x64/src/io/io.go:386
os/exec.(*Cmd).writerDescriptor.func1
    /opt/hostedtoolcache/go/1.19.13/x64/src/os/exec/exec.go:407
os/exec.(*Cmd).Start.func1
    /opt/hostedtoolcache/go/1.19.13/x64/src/os/exec/exec.go:544
2023-11-30T21:36:23Z    INFO    k8up.restic.restic.backup.progress  backup finished {"new files": 0, "changed files": 227, "errors": 2}
2023-11-30T21:36:23Z    INFO    k8up.restic.restic.backup.progress  stats   {"time": 1.509957467, "bytes added": 121634, "bytes processed": 5386168}
2023-11-30T21:36:23Z    INFO    k8up.restic.statsHandler.promStats  sending prometheus stats    {"url": "prometheus-prometheus-pushgateway.prometheus.svc.cluster.local:9091"}
2023-11-30T21:36:23Z    INFO    k8up.restic.restic.backup.progress  restic output   {"msg": "Warning: at least one source file could not be read"}
2023-11-30T21:36:23Z    INFO    k8up.restic.restic.backup   backup finished, sending snapshot list
2023-11-30T21:36:23Z    INFO    k8up.restic.restic.snapshots    getting list of snapshots
2023-11-30T21:36:23Z    INFO    k8up.restic.restic.snapshots.command    restic command  {"path": "/usr/local/bin/restic", "args": ["snapshots", "--option", "", "--json"]}
2023-11-30T21:36:23Z    INFO    k8up.restic.restic.snapshots.command    Defining RESTIC_PROGRESS_FPS    {"frequency": 0.016666666666666666}

Expected Behavior

The Backup job should not make the mountpoint inaccessible for other pods.

Steps To Reproduce

StorageClass:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-fast
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880" # 48 hours in minutes
  fromBackup: ""
  fsType: "ext4"
  diskSelector: "nvme"

I'm using the k8up Helm chart with the following values.yaml:

k8up:
  podAnnotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: /metrics
    prometheus.io/port: "8080"

  k8up:
    envVars:
      - name: BACKUP_PROMURL
        value: "prometheus-prometheus-pushgateway.prometheus.svc.cluster.local:9091"
      - name: BACKUP_GLOBALS3ENDPOINT
        value: "http://minio-backup:9000"
      - name: BACKUP_GLOBALACCESSKEYID
        valueFrom:
          secretKeyRef:
            name: minio-credentials
            key: username
      - name: BACKUP_GLOBALSECRETACCESSKEY
        valueFrom:
          secretKeyRef:
            name: minio-credentials
            key: password
      - name: BACKUP_GLOBALREPOPASSWORD
        valueFrom:
          secretKeyRef:
            name: minio-credentials
            key: repoPassword
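
The env vars above reference a minio-credentials Secret; for completeness, a minimal sketch of what it contains (key names taken from the values above, the values themselves are placeholders):

apiVersion: v1
kind: Secret
metadata:
  name: minio-credentials
  namespace: k8up   # the namespace the operator runs in; adjust as needed
type: Opaque
stringData:
  username: <minio-access-key-id>          # placeholder
  password: <minio-secret-access-key>      # placeholder
  repoPassword: <restic-repository-password> # placeholder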

This is my WordPress PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    k8up.io/backup: 'true'
  labels:
    app.kubernetes.io/instance: blog
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: wordpress
    app.kubernetes.io/version: 6.4.1
    argocd.argoproj.io/instance: blog
    helm.sh/chart: wordpress-18.1.15
  name: blog-wordpress
  namespace: blog
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: longhorn-fast

And this is my k8up Backup:

apiVersion: k8up.io/v1
kind: Backup
metadata:
  labels:
    argocd.argoproj.io/instance: blog
  name: blog-k8up-backup
  namespace: blog
spec:
  backend:
    s3:
      bucket: blog-backup
  failedJobsHistoryLimit: 2
  podSecurityContext:
    runAsUser: 1001
  successfulJobsHistoryLimit: 2
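
After applying this Backup, the operator creates a Job in the same namespace; to watch it with plain kubectl (nothing k8up-specific assumed beyond the CRD):

$ kubectl -n blog get backup blog-k8up-backup
$ kubectl -n blog get jobs,pods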

Version of K8up

v2.7.2

Version of Kubernetes

v1.28.3+k3s2

Distribution of Kubernetes

Rancher k3s

poyaz commented 7 months ago

Hello @nupplaphil

From what I've tested, the k8up operator creates the backup pod with the volumes mounted read-only for restic. (With the RWO access mode, Kubernetes doesn't let the volume be mounted for writing by another pod.)

Could you share the backup pod definition, so we can see how the volume is mounted in the backup pod?
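
For example, assuming the pod created by the backup Job is still listed:

$ kubectl -n blog get pods
$ kubectl -n blog get pod <backup-pod-name> -o yaml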

Kidswiss commented 6 months ago

Hi @nupplaphil

To further elaborate on @poyaz's answer: yes, K8up mounts the volumes it backs up with a read-only volumeMount, as you can see here: https://github.com/k8up-io/k8up/blob/master/operator/backupcontroller/backup_utils.go#L27

This is a read-only flag at the mount stage, so there should be no way for K8up to make any changes to the files in the volume. My guess would be that Longhorn messes up the permissions during mount or unmount. Unfortunately, I've never really used it.
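
For illustration, the relevant part of the generated backup pod spec looks roughly like this (a sketch based on the code linked above, not a verbatim dump; the mountPath matches the /data/blog-wordpress path in your logs):

spec:
  containers:
    - name: backup
      volumeMounts:
        - name: blog-wordpress
          mountPath: /data/blog-wordpress
          readOnly: true
  volumes:
    - name: blog-wordpress
      persistentVolumeClaim:
        claimName: blog-wordpress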

mhymny commented 2 months ago

Hi, I just ran into the same problem. On my end it looks like an SELinux-related problem, based on a lot of "avc: denied" errors in the journal. Maybe it's because of the labeling applied when mounting a volume.

Running ausearch -m avc --start recent indeed shows:

path="/data/db.sqlite3-shm" dev="rbd0" ino=16 scontext=system_u:system_r:container_t:s0:c236,c716 tcontext=system_u:object_r:container_file_t:s0:c359,c809

I think k8up is trying to access /data/db.sqlite3-shm with SELinux level s0:c236,c716, while the file has level s0:c359,c809. Hence the denial.

Customizing the PodSecurityContext fixed the problem for me.
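
For anyone else hitting this: a sketch of the kind of customization meant here, on the Backup resource (the seLinuxOptions values are illustrative, not my exact config; pick a level that matches your cluster's policy or the volume's label from the ausearch output):

spec:
  podSecurityContext:
    runAsUser: 1001
    seLinuxOptions:
      # Illustrative: use the category pair the volume is labeled with,
      # so the backup pod's processes may read those files.
      level: "s0:c359,c809"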