k8up-io / wrestic

Restic Backup Kubernetes and OpenShift Wrapper - Part of K8up
BSD 3-Clause "New" or "Revised" License

[Bug] cluster backup on GKE clusters fails with wrestic > 0.1.9 #95

TheBigLee closed this issue 3 years ago

TheBigLee commented 3 years ago

Describe the bug

On GCP clusters, the cluster backup does not work at all with any wrestic version > 0.1.9. With these versions, the objects-backup pod hangs indefinitely and is never terminated. Unless they are cleaned up manually, a couple of dozen backup pods end up running after a while. Furthermore, the affected clusters have no cluster backup at all.

Additional context

Not really sure if this is a wrestic or a restic issue.

Logs


I0701 14:03:31.273296       1 main.go:43] wrestic "level"=0 "msg"="Wrestic Version: v0.3.1-dirty"  
I0701 14:03:31.274101       1 main.go:44] wrestic "level"=0 "msg"="Operator Build Date: Thu May 27 13:37:01 UTC 2021"  
I0701 14:03:31.274161       1 main.go:45] wrestic "level"=0 "msg"="Go Version: go1.16.4"  
I0701 14:03:31.274193       1 main.go:46] wrestic "level"=0 "msg"="Go OS/Arch: linux/amd64"  
I0701 14:03:31.275510       1 main.go:208] wrestic "level"=0 "msg"="setting up a signal handler"  
I0701 14:03:31.275735       1 command.go:57] wrestic/RepoInit/command "level"=0 "msg"="restic command"  "args"=["init"] "path"="/usr/local/bin/restic"
I0701 14:03:31.275836       1 command.go:83] wrestic/RepoInit/command "level"=0 "msg"="Defining RESTIC_PROGRESS_FPS"  "frequency"=0.016666666666666666
I0701 14:03:32.410114       1 snapshots.go:39] wrestic/snapshots "level"=0 "msg"="getting list of snapshots"  
I0701 14:03:32.410240       1 command.go:57] wrestic/snapshots/command "level"=0 "msg"="restic command"  "args"=["snapshots","--json"] "path"="/usr/local/bin/restic"
I0701 14:03:32.410298       1 command.go:83] wrestic/snapshots/command "level"=0 "msg"="Defining RESTIC_PROGRESS_FPS"  "frequency"=0.016666666666666666
I0701 14:03:39.139106       1 pod_list.go:50] wrestic/k8sClient "level"=0 "msg"="listing all pods"  "annotation"="k8up.syn.tools/backupcommand" "namespace"="syn-cluster-backup"
I0701 14:03:39.159207       1 pod_list.go:80] wrestic/k8sClient "level"=0 "msg"="adding to backup list"  "namespace"="syn-cluster-backup" "pod"="object-dumper-6554b46655-v4n6r"
I0701 14:03:39.160492       1 pod_exec.go:45] wrestic/k8sExec "level"=0 "msg"="executing command"  "command"="/usr/local/bin/dump-objects, -sd, /data" "namespace"="syn-cluster-backup" "pod"="object-dumper-6554b46655-v4n6r"
I0701 14:03:39.160694       1 stdinbackup.go:16] wrestic/stdinBackup "level"=0 "msg"="starting stdin backup"  "extension"=".tar.gz" "filename"="/syn-cluster-backup-object-dumper"
I0701 14:03:39.160726       1 command.go:57] wrestic/stdinBackup/command "level"=0 "msg"="restic command"  "args"=["backup","--host","syn-cluster-backup","--json","--stdin","--stdin-filename","/syn-cluster-backup-object-dumper.tar.gz"] "path"="/usr/local/bin/restic"
I0701 14:03:39.160799       1 command.go:83] wrestic/stdinBackup/command "level"=0 "msg"="Defining RESTIC_PROGRESS_FPS"  "frequency"=0.016666666666666666
I0701 14:03:42.526752       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="List(index) returned error, retrying after 396.557122ms: Truncated response should have continuation token set"
I0701 14:03:43.567689       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="List(index) returned error, retrying after 597.811922ms: Truncated response should have continuation token set"
I0701 14:03:44.852300       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="List(index) returned error, retrying after 1.409144665s: Truncated response should have continuation token set"
E0701 14:03:46.764007       1 logging.go:122] wrestic/object-dumper-6554b46655-v4n6r "msg"="Error from server (NotFound): Unable to list \"/v1, Resource=bindings\": the server could not find the requested resource" "error"="error during command"  
I0701 14:03:47.194527       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="List(index) returned error, retrying after 1.192358242s: Truncated response should have continuation token set"
I0701 14:03:49.158491       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="List(index) returned error, retrying after 3.456004252s: Truncated response should have continuation token set"
I0701 14:03:53.234635       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="List(index) returned error, retrying after 4.543793083s: Truncated response should have continuation token set"
I0701 14:03:58.754085       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="List(index) returned error, retrying after 5.830976587s: Truncated response should have continuation token set"
E0701 14:03:59.115617       1 logging.go:122] wrestic/object-dumper-6554b46655-v4n6r "msg"="Error from server (NotFound): Unable to list \"authorization.k8s.io/v1, Resource=localsubjectaccessreviews\": the server could not find the requested resource" "error"="error during command"  
I0701 14:04:05.440278       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="List(index) returned error, retrying after 4.513276731s: Truncated response should have continuation token set"
E0701 14:04:07.779741       1 logging.go:122] wrestic/object-dumper-6554b46655-v4n6r "msg"="Error from server (MethodNotAllowed): the server does not allow this method on the requested resource" "error"="error during command"  
E0701 14:04:07.880705       1 logging.go:122] wrestic/object-dumper-6554b46655-v4n6r "msg"="Error from server (MethodNotAllowed): the server does not allow this method on the requested resource" "error"="error during command"  
E0701 14:04:08.902086       1 logging.go:122] wrestic/object-dumper-6554b46655-v4n6r "msg"="Error from server (MethodNotAllowed): the server does not allow this method on the requested resource" "error"="error during command"  
E0701 14:04:09.115751       1 logging.go:122] wrestic/object-dumper-6554b46655-v4n6r "msg"="Error from server (MethodNotAllowed): the server does not allow this method on the requested resource" "error"="error during command"  
I0701 14:04:10.782101       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="List(index) returned error, retrying after 8.436116856s: Truncated response should have continuation token set"
I0701 14:04:19.838295       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="List(index) returned error, retrying after 21.283270947s: Truncated response should have continuation token set"
I0701 14:04:42.340897       1 logging.go:150] wrestic/stdinBackup/progress "level"=0 "msg"="restic output"  "msg"="Fatal: Truncated response should have continuation token set"

Expected behavior

Wrestic should be able to back up cluster objects on GKE clusters.

To Reproduce

Steps to reproduce the behavior:

  1. Set up a GKE cluster
  2. Configure k8up and cluster-backup with the latest versions of k8up and wrestic
  3. Configure GKE S3 storage
  4. Observe the failing cluster backups


cimnine commented 3 years ago

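From this bug report on restic, it looks like GKE S3 does not support version 2 of the S3 list-objects API yet, which would explain the "Truncated response should have continuation token set" errors in your log. Could you try setting BACKUP_RESTIC_OPTIONS to s3.list-objects-v1=true in K8up? This option should then be passed from K8up to Wrestic, and from Wrestic to Restic.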
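
To illustrate what "passed to Restic" could amount to, here is a minimal Python sketch, not Wrestic's actual code (Wrestic is written in Go). It assumes the variable holds comma-separated key=value pairs and that each pair is forwarded through restic's `-o` (extended option) flag. The helper names `restic_option_args` and `restic_command` are hypothetical.

```python
import shlex
from typing import List


def restic_option_args(raw: str) -> List[str]:
    """Turn 'k1=v1,k2=v2' into ['-o', 'k1=v1', '-o', 'k2=v2'].

    Assumption: options are comma-separated key=value pairs.
    """
    args: List[str] = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        key, sep, value = item.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            raise ValueError(f"expected key=value, got {item!r}")
        args += ["-o", f"{key}={value}"]
    return args


def restic_command(subcommand_args: List[str], raw_options: str,
                   binary: str = "/usr/local/bin/restic") -> List[str]:
    """Build a hypothetical restic argv with the extended options appended."""
    return [binary, *subcommand_args, *restic_option_args(raw_options)]


if __name__ == "__main__":
    cmd = restic_command(["snapshots", "--json"], "s3.list-objects-v1=true")
    print(shlex.join(cmd))
```

If the option is picked up as intended, the "restic command" log lines would be a natural place to look for the extra `-o` argument, although whether Wrestic logs it in that form is also an assumption.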

TheBigLee commented 3 years ago

Thanks for the hint.

Works like a charm now.