vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0

error getting backup volume info: DownloadRequest.velero.io - PartiallyFailed #7677

Open dove-young opened 5 months ago

dove-young commented 5 months ago

What steps did you take and what happened:

I installed velero according to this doc https://velero.io/docs/v1.13/file-system-backup/#how-velero-integrates-with-restic

oc create ns velero
oc annotate namespace velero openshift.io/node-selector=""
oc adm policy add-scc-to-user privileged -z velero -n velero

velero install \
    --provider minio \
    --features=EnableCSI \
    --plugins velero/velero-plugin-for-aws:v1.2.1 \
    --bucket $BUCKET \
    --backup-location-config region=$REGION \
    --secret-file $CREDENTIAL \
    --uploader-type restic \
    --use-node-agent --privileged-node-agent \
    --use-volume-snapshots=false \
    --backup-location-config region=$REGION,s3ForcePathStyle="true",s3Url=$MINIO_URL

Then I created a backup storage location:

velero backup-location create bsl-o1-726938 \
  --access-mode ReadWrite \
  --provider aws \
  --bucket $BUCKET \
  --config region=$REGION,s3ForcePathStyle="true",s3Url=$MINIO_URL

Then I created a backup with volumes:

velero backup create instana-backup-vol-2 \
    --storage-location bsl-o1-726938 --wait \
    --include-namespaces instana-agent,instana-cassandra,instana-clickhouse,instana-core,instana-elastic,instana-kafka,instana-operator,instana-postgres,instana-units,instana-zookeeper \
    --include-cluster-resources=false \
    --default-volumes-to-fs-backup 

Then I checked the backup and found an error:

velero backup describe instana-backup-vol-2

The error looks like this:

Name:         instana-backup-vol-2
Namespace:    velero
Labels:       velero.io/storage-location=bsl-o1-726938
Annotations:  velero.io/resource-timeout=10m0s
              velero.io/source-cluster-k8s-gitversion=v1.28.7+6e2789b
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=28

Phase:  PartiallyFailed (run `velero backup logs instana-backup-vol-2` for more information)

Warnings:
  Velero:     <none>
  Cluster:    <none>
  Namespaces:
    instana-cassandra:   resource: /pods name: /instana-cassandra-default-sts-0 message: /volume server-config-base is declared in pod instana-cassandra/instana-cassandra-default-sts-0 but not mounted by any container, skipping
                         resource: /pods name: /instana-cassandra-default-sts-1 message: /volume server-config-base is declared in pod instana-cassandra/instana-cassandra-default-sts-1 but not mounted by any container, skipping
                         resource: /pods name: /instana-cassandra-default-sts-2 message: /volume server-config-base is declared in pod instana-cassandra/instana-cassandra-default-sts-2 but not mounted by any container, skipping
    instana-kafka:       resource: /pods name: /instana-entity-operator-64594b464d-c6v6w message: /volume strimzi-tls-sidecar-tmp is declared in pod instana-kafka/instana-entity-operator-64594b464d-c6v6w but not mounted by any container, skipping

Errors:
  Velero:   name: /chi-instana-local-0-1-0 message: /Error backing up item error: /pod volume backup failed: data path backup failed: error running restic backup command restic backup --repo=s3:http://9.112.252.135:9000/veleco-o1-726938/restic/instana-clickhouse --password-file=/tmp/credentials/velero/velero-repo-credentials-repository-password --cache-dir=/scratch/.cache/restic . --tag=backup=instana-backup-vol-2 --tag=backup-uid=b8e93ccc-dbe0-49e1-a784-6751ae753dea --tag=ns=instana-clickhouse --tag=pod=chi-instana-local-0-1-0 --tag=pod-uid=ae209b30-7d8f-40a5-b48a-50047e58a3a1 --tag=pvc-uid=dd5296ab-6ec1-49b2-89f9-1a41ad7a3bdb --tag=volume=instana-clickhouse-data-volume --host=velero --json with error: exit status 3 stderr: {"message_type":"error","error":{"Op":"lstat","Path":"store/34e/34e02f21-a323-4df1-b2dd-23d1d87edf10/202404_49408_49408_0","Err":2},"during":"archival","item":"/host_pods/ae209b30-7d8f-40a5-b48a-50047e58a3a1/volumes/kubernetes.io~csi/pvc-dd5296ab-6ec1-49b2-89f9-1a41ad7a3bdb/mount/store/34e/34e02f21-a323-4df1-b2dd-23d1d87edf10/202404_49408_49408_0"}
Warning: at least one source file could not be read

  Cluster:    <none>
  Namespaces: <none>

Namespaces:
  Included:  instana-agent, instana-cassandra, instana-clickhouse, instana-core, instana-elastic, instana-kafka, instana-operator, instana-postgres, instana-units, instana-zookeeper
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  excluded

Label selector:  <none>

Or label selector:  <none>

Storage Location:  bsl-o1-726938

Velero-Native Snapshot PVs:  auto
Snapshot Move Data:          false
Data Mover:                  velero

TTL:  720h0m0s

CSISnapshotTimeout:    10m0s
ItemOperationTimeout:  4h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2024-04-14 22:58:27 -0700 PDT
Completed:  2024-04-14 23:26:32 -0700 PDT

Expiration:  2024-05-14 22:58:18 -0700 PDT

Total items to be backed up:  855
Items backed up:              855

Backup Volumes:
  <error getting backup volume info: DownloadRequest.velero.io "instana-backup-vol-2-f3d79a6e-db2f-4bba-91af-19ca40848dd5" is invalid: spec.target.kind: Unsupported value: "BackupVolumeInfos": supported values: "BackupLog", "BackupContents", "BackupVolumeSnapshots", "BackupItemOperations", "BackupResourceList", "BackupResults", "RestoreLog", "RestoreResults", "RestoreResourceList", "RestoreItemOperations", "CSIBackupVolumeSnapshots", "CSIBackupVolumeSnapshotContents">

What did you expect to happen:

The following information will help us better understand what's going on:

If you are using velero v1.7.0+:
Please use `velero debug --backup <backupname> --restore <restorename>` to generate the support bundle and attach it to this issue. For more options, refer to `velero debug --help`.

If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)

Anything else you would like to add:

Environment:

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

dove-young commented 5 months ago

debug file

bundle-2024-04-14-23-28-45.tar.gz

kaovilai commented 5 months ago
  <error getting backup volume info: DownloadRequest.velero.io "instana-backup-vol-2-f3d79a6e-db2f-4bba-91af-19ca40848dd5" is invalid: spec.target.kind: Unsupported value: "BackupVolumeInfos": supported values: "BackupLog", "BackupContents", "BackupVolumeSnapshots", "BackupItemOperations", "BackupResourceList", "BackupResults", "RestoreLog", "RestoreResults", "RestoreResourceList", "RestoreItemOperations", "CSIBackupVolumeSnapshots", "CSIBackupVolumeSnapshotContents">

You might have an outdated CRD in the cluster. Try reinstalling.
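One way to confirm the CRD is stale (a sketch; the CRD name `downloadrequests.velero.io` and the schema path are from a standard Velero install, and the exact jsonpath may differ by CRD version) is to read the allowed `spec.target.kind` values out of the installed schema. The `Unsupported value` error above comes from exactly this enum, so if `BackupVolumeInfos` is missing here, the CRD predates the running server:

```shell
# Print the enum of target kinds baked into the installed DownloadRequest CRD.
# If "BackupVolumeInfos" is absent, the CRD is older than the Velero server.
kubectl get crd downloadrequests.velero.io \
  -o jsonpath='{.spec.versions[0].schema.openAPIV3Schema.properties.spec.properties.target.properties.kind.enum}'
```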

github-actions[bot] commented 3 months ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands.

vapetri commented 3 months ago

Hi, is there a resolution for this? I have a similar issue with a restore stuck in the InProgress state:

NAME                                      BACKUP                            STATUS       STARTED                         COMPLETED   ERRORS   WARNINGS   CREATED                         SELECTOR
restore-nginx-example-ns-backup-w-pvc-1   nginx-example-ns-backup-w-pvc-3   InProgress   2024-06-28 13:03:14 +0000 UTC   <nil>       0        0          2024-06-28 13:03:14 +0000 UTC   <none>

The log of the restored pod with the PVC says:

restore-wait Found the done file /restores/nginx-logs/.velero/f0d60df1-b391-44ee-8354-6765ecb32930
restore-wait All restic restores are done
restore-wait Deleted /restores/nginx-logs/.velero
restore-wait Done cleanup .velero folder
nginx /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
nginx /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
nginx /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
nginx 10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf     
nginx /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
nginx /docker-entrypoint.sh: Configuration complete; ready for start up

And the restore describe output says this for the PVC:

kopia Restores:
  Completed:
    nginx-example/nginx-deployment-7f5b87dc57-s5vcl: nginx-logs
  <error getting restore volume info: DownloadRequest.velero.io "restore-nginx-example-ns-backup-w-pvc-1-149fa96d-8e6d-4ce2-9aba-0e69c1d24c82" is invalid: spec.target.kind: Unsupported value: "RestoreVolumeInfo": supported values: "BackupLog", "BackupContents", "BackupVolumeSnapshots", "BackupItemOperations", "BackupResourceList", "BackupResults", "RestoreLog", "RestoreResults", "RestoreResourceList", "RestoreItemOperations", "CSIBackupVolumeSnapshots", "CSIBackupVolumeSnapshotContents", "BackupVolumeInfos">

One more thing that may also be related: in the same Kubernetes cluster I also have a Portworx instance, which might use the same CRDs as Velero.

sseago commented 3 months ago

@vapetri Your CRDs are out-of-date. "RestoreVolumeInfo" was added recently (I think it's new as of 1.14), so if that's invalid, it means that you have the Velero CRDs installed from a previous installation. Velero won't work if the CRDs in-cluster don't match the CRDs for the version of Velero you're using. If this same kubernetes cluster has Velero installed in two different namespaces, then they must both be the same version, since CRDs are cluster-scoped.
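If a full uninstall/reinstall is undesirable, the Velero upgrade docs describe regenerating the CRDs from the CLI binary and applying them over the stale ones (a sketch; this assumes the `velero` CLI on your PATH matches the server version you want the CRDs to match):

```shell
# Render only the CRD manifests from the local velero CLI and apply them,
# replacing the stale in-cluster CRDs without touching the deployment.
velero install --crds-only --dry-run -o yaml | kubectl apply -f -
```

Since CRDs are cluster-scoped, this also fixes every Velero namespace in the cluster at once, which is why two installs at different versions cannot coexist.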

vapetri commented 3 months ago

@sseago Hello Scott, and thank you for the reply. I used the latest velero CLI to install and uninstall, as I have done many times.

root@velero-admin:~# velero version
Client:
    Version: v1.14.0
    Git commit: 2fc6300f2239f250b40b0488c35feae59520f2d3
Server:
    Version: v1.14.0
root@velero-admin:~# 

I will uninstall again and ensure all CRDs are deleted. There is only one Velero instance running on my Kubernetes cluster.

velero uninstall 
You are about to uninstall Velero.
Are you sure you want to continue (Y/N)? y
Waiting for resource with attached finalizer to be deleted
.
Waiting for velero namespace "velero" to be deleted
.....................................................................

Indeed, I had to run

kubectl delete crds -l component=velero

to be sure that all CRDs were removed. After reinstalling Velero, I managed to restore the nginx-example.
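The steps above can be sketched as a short verification sequence (assumes a standard install where the Velero CRDs carry the `component=velero` label, as used in the delete command above):

```shell
# 1. Confirm the old CRDs are really gone before reinstalling.
kubectl get crds -l component=velero   # should return "No resources found"

# 2. After reinstalling, confirm client and server versions agree;
#    a CRD/server mismatch is what produced the "Unsupported value" error.
velero version

# 3. Confirm the freshly created CRDs are back in place.
kubectl get crds -l component=velero
```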

Thank you again,

Smogu93 commented 2 months ago

Author, have you managed to work out a backup solution for Instana on OCP? How can I contact you? I have a few questions, as I am currently dealing with this topic myself.

github-actions[bot] commented 1 week ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands.