Closed dmrub closed 2 months ago
This is expected.
The VolumeSnapshotContent is first patched to Retain. After the backup, the namespaced VolumeSnapshot is removed, so that the namespace can be deleted without the deletion cascading to the (now retained) VolumeSnapshotContent, which is needed to restore from the backup.
The VolumeSnapshot objects will be removed from the cluster after the backup is uploaded to the object storage, so that the namespace that is backed up can be deleted without removing the snapshot in the storage provider if the DeletionPolicy is Delete.
The only case in which Velero removes VolumeSnapshotContent objects is when the backup expires or is deleted.
When the Velero backup expires, the VolumeSnapshot objects will be deleted and the VolumeSnapshotContent objects will be updated to have a DeletionPolicy of Delete, to free space on the storage system.
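To make the lifecycle described above concrete, a retained VolumeSnapshotContent might look roughly like the excerpt below. This is an illustrative sketch, not output captured from the cluster: the object name, label value, and driver are taken from later in this thread, and all other fields are omitted.

```yaml
# Illustrative excerpt only; fields not relevant to the discussion are omitted.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: snapcontent-24a68189-c19b-4a1b-968c-ceaa3e2876a1
  labels:
    # Label reported by the issue author on the leftover object.
    velero.io/backup-name: velero-backup-every-two-hours-20240307181025
spec:
  # Patched to Retain during backup, so that removing the VolumeSnapshot
  # does not remove the snapshot in the storage provider.
  deletionPolicy: Retain
  driver: linstor.csi.linbit.com
```

Per the explanation above, the policy would only be switched back to Delete when the backup expires or is deleted.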
After the backup, I see that there are no VolumeSnapshots, but there are still VolumeSnapshotContents.
If volumesnapshotcontents are removed as well, velero wouldn't be able to restore your data.
Alternatively you can look into https://velero.io/docs/v1.13/csi-snapshot-data-movement/ which removes the need for retained snapshot on cluster by moving data to object store.
I use snapshot data movement, so I expect "VolumeSnapshotContent" objects to always be deleted after the backup and data movement are complete. Here is my Velero schedule:
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nginx-example-backup-every-two-hours
  namespace: velero
  annotations:
    velero.io/csi-volumesnapshot-class_disk.csi.cloud.com: "linstor"
spec:
  schedule: "10 0,8-20/2 * * 1-6"
  template:
    csiSnapshotTimeout: 20m
    snapshotVolumes: true
    snapshotMoveData: true
    includedNamespaces:
      - "nginx-example"
    includedResources:
      - "*"
    storageLocation: default
    volumeSnapshotLocations:
      - default
    ttl: 168h0m0s
hrm.. I'm no expert, but wouldn't you want to ONLY set snapshotMoveData: true and not snapshotVolumes?
Related PR: https://github.com/vmware-tanzu/velero/pull/6827
Already-fixed issue: https://github.com/vmware-tanzu/velero/issues/6786
The fix was also cherry-picked to 1.12.
OK, based on the bundle file provided, at kubecapture/velero.io_v1/velero/backups-202403081310.2979.json, you have in fact not enabled snapshotMoveData on the backup associated with the "leftover" VolumeSnapshotContent, so this falls back to the expected behavior for CSI snapshots without snapshotMoveData.
{
  "apiVersion": "velero.io/v1",
  "kind": "Backup",
  "metadata": {
    "annotations": {
      "velero.io/resource-timeout": "10m0s",
      "velero.io/source-cluster-k8s-gitversion": "v1.28.7",
      "velero.io/source-cluster-k8s-major-version": "1",
      "velero.io/source-cluster-k8s-minor-version": "28"
    },
    "creationTimestamp": "2024-03-07T18:10:25Z",
    "generation": 6,
    "labels": {
      "kustomize.toolkit.fluxcd.io/name": "stage07",
      "kustomize.toolkit.fluxcd.io/namespace": "flux-system",
      "velero.io/schedule-name": "velero-backup-every-two-hours",
      "velero.io/storage-location": "default"
    },
    "name": "velero-backup-every-two-hours-20240307181025",
    "namespace": "velero",
    "resourceVersion": "947441",
    "uid": "e714b1df-dea1-445b-bbf5-4d83a50afe01"
  },
  "spec": {
    "csiSnapshotTimeout": "10m0s",
    "defaultVolumesToFsBackup": false,
    "hooks": {},
    "includedNamespaces": [
      "velero"
    ],
    "includedResources": [
      "*"
    ],
    "itemOperationTimeout": "4h0m0s",
    "metadata": {},
    "snapshotMoveData": false,
    "storageLocation": "default",
    "ttl": "168h0m0s",
    "volumeSnapshotLocations": [
      "default"
    ]
  },
  "status": {
    "backupItemOperationsAttempted": 2,
    "backupItemOperationsCompleted": 2,
    "completionTimestamp": "2024-03-07T18:11:26Z",
    "csiVolumeSnapshotsAttempted": 1,
    "csiVolumeSnapshotsCompleted": 1,
    "expiration": "2024-03-14T18:11:14Z",
    "formatVersion": "1.1.0",
    "hookStatus": {},
    "phase": "Completed",
    "progress": {
      "itemsBackedUp": 152,
      "totalItems": 152
    },
    "startTimestamp": "2024-03-07T18:11:14Z",
    "version": 1
  }
},
The Schedule for this backup in the provided bundle file, which is now paused, also did not have snapshotMoveData set, which means that backups generated from this schedule will not use snapshotMoveData.
{
  "apiVersion": "velero.io/v1",
  "kind": "Schedule",
  "metadata": {
    "creationTimestamp": "2024-03-06T10:09:31Z",
    "generation": 20,
    "labels": {
      "kustomize.toolkit.fluxcd.io/name": "stage07",
      "kustomize.toolkit.fluxcd.io/namespace": "flux-system"
    },
    "name": "velero-backup-every-two-hours",
    "namespace": "velero",
    "resourceVersion": "1453983",
    "uid": "36d3badf-b6a4-43ce-b5d6-78a6a4c109cf"
  },
  "spec": {
    "paused": true,
    "schedule": "10 0,8-20/2 * * 1-6",
    "template": {
      "csiSnapshotTimeout": "0s",
      "hooks": {},
      "includedNamespaces": [
        "velero"
      ],
      "includedResources": [
        "*"
      ],
      "itemOperationTimeout": "0s",
      "metadata": {},
      "storageLocation": "default",
      "ttl": "168h0m0s",
      "volumeSnapshotLocations": [
        "default"
      ]
    }
  },
  "status": {
    "lastBackup": "2024-03-08T10:10:26Z",
    "phase": "Enabled"
  }
}
@kaovilai maybe this issue is due to some weird misconfiguration? I have multiple schedules and only nginx-example should create volume snapshots and move them to S3 storage. However, there is also a velero-backup schedule that should only store one configuration of the velero namespace:
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: velero-backup-every-two-hours
  namespace: velero
spec:
  schedule: '10 0,8-20/2 * * 1-6'
  template:
    includedNamespaces:
      - 'velero'
    includedResources:
      - '*'
    storageLocation: default
    volumeSnapshotLocations:
      - default
    ttl: 168h0m0s
But many of the operations take place in the velero namespace (e.g. data uploads). Now, after you pointed me to the VolumeSnapshotContent that had no VolumeSnapshot (I actually missed that its label is velero.io/backup-name=velero-backup-every-two-hours-20240307181025 and not nginx-example...), I looked at the backup and saw that the velero-backup contains CSI snapshots of the nginx-example namespace (see below). In the velero-backup-every-two-hours schedule, neither the snapshotMoveData nor the snapshotVolumes property is set. Why does velero-backup get this snapshot from nginx-example? It looks like a temporary snapshot object as part of a nginx-example backup process !
$ velero describe backup velero-backup-every-two-hours-20240307181025 --details
Name: velero-backup-every-two-hours-20240307181025
Namespace: velero
Labels: kustomize.toolkit.fluxcd.io/name=stage07
        kustomize.toolkit.fluxcd.io/namespace=flux-system
        velero.io/schedule-name=velero-backup-every-two-hours
        velero.io/storage-location=default
Annotations: velero.io/resource-timeout=10m0s
             velero.io/source-cluster-k8s-gitversion=v1.28.7
             velero.io/source-cluster-k8s-major-version=1
             velero.io/source-cluster-k8s-minor-version=28
Phase: Completed
Namespaces:
  Included: velero
  Excluded: <none>
Resources:
  Included: *
  Excluded: <none>
  Cluster-scoped: auto
Label selector: <none>
Or label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
Snapshot Move Data: false
Data Mover: velero
TTL: 168h0m0s
CSISnapshotTimeout: 10m0s
ItemOperationTimeout: 4h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2024-03-07 19:11:14 +0100 CET
Completed: 2024-03-07 19:11:26 +0100 CET
Expiration: 2024-03-14 19:11:14 +0100 CET
Total items to be backed up: 152
Items backed up: 152
Backup Item Operations:
  Operation for volumesnapshots.snapshot.storage.k8s.io velero/velero-nginx-example-backup-every-two-hours-20240307181025w7lhg:
    Backup Item Action Plugin: velero.io/csi-volumesnapshot-backupper
    Operation ID: velero/velero-nginx-example-backup-every-two-hours-20240307181025w7lhg/2024-03-07T18:11:24Z
    Items to Update:
      volumesnapshots.snapshot.storage.k8s.io velero/velero-nginx-example-backup-every-two-hours-20240307181025w7lhg
    Phase: Completed
    Created: 2024-03-07 19:11:24 +0100 CET
    Started: 2024-03-07 19:11:24 +0100 CET
  Operation for volumesnapshotcontents.snapshot.storage.k8s.io /snapcontent-24a68189-c19b-4a1b-968c-ceaa3e2876a1:
    Backup Item Action Plugin: velero.io/csi-volumesnapshotcontent-backupper
    Operation ID: snapcontent-24a68189-c19b-4a1b-968c-ceaa3e2876a1/2024-03-07T18:11:24Z
    Items to Update:
      volumesnapshotcontents.snapshot.storage.k8s.io /snapcontent-24a68189-c19b-4a1b-968c-ceaa3e2876a1
    Phase: Completed
    Created: 2024-03-07 19:11:24 +0100 CET
    Started: 2024-03-07 19:11:24 +0100 CET
Resource List:
  apiextensions.k8s.io/v1/CustomResourceDefinition:
    - backuprepositories.velero.io
    - backups.velero.io
    - backupstoragelocations.velero.io
    - datauploads.velero.io
    - helmreleases.helm.toolkit.fluxcd.io
    - prometheusrules.monitoring.coreos.com
    - schedules.velero.io
    - sealedsecrets.bitnami.com
    - servicemonitors.monitoring.coreos.com
    - volumesnapshotlocations.velero.io
    - volumesnapshots.snapshot.storage.k8s.io
  apps/v1/ControllerRevision:
    - velero/node-agent-58788bcc87
  apps/v1/DaemonSet:
    - velero/node-agent
  apps/v1/Deployment:
    - velero/velero
  apps/v1/ReplicaSet:
    - velero/velero-db67f5587
  bitnami.com/v1alpha1/SealedSecret:
    - velero/credentials-velero
  discovery.k8s.io/v1/EndpointSlice:
    - velero/velero-gn2t9
  helm.toolkit.fluxcd.io/v2beta2/HelmRelease:
    - velero/velero
  monitoring.coreos.com/v1/PrometheusRule:
    - velero/velero
  monitoring.coreos.com/v1/ServiceMonitor:
    - velero/velero
  rbac.authorization.k8s.io/v1/ClusterRole:
    - cluster-admin
  rbac.authorization.k8s.io/v1/ClusterRoleBinding:
    - velero-server
  rbac.authorization.k8s.io/v1/Role:
    - velero/velero-server
  rbac.authorization.k8s.io/v1/RoleBinding:
    - velero/velero-server
  snapshot.storage.k8s.io/v1/VolumeSnapshot:
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt
    - velero/velero-nginx-example-backup-every-two-hours-20240307181025w7lhg
  snapshot.storage.k8s.io/v1/VolumeSnapshotClass:
    - linstor
  snapshot.storage.k8s.io/v1/VolumeSnapshotContent:
    - nginx-example-backup-every-two-hours-20240307181025-hhlnt
    - snapcontent-24a68189-c19b-4a1b-968c-ceaa3e2876a1
  v1/ConfigMap:
    - velero/kube-root-ca.crt
  v1/Endpoints:
    - velero/velero
  v1/Event:
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt.17ba8dee766f31c9
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt.17ba8dee777619cc
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt.17ba8dee7790cd87
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt.17ba8dee77e2ccda
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt.17ba8dee77e3174d
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt.17ba8dee77e85938
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt.17ba8deef03eb978
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt.17ba8def2a542157
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt.17ba8def4e25116b
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8ded4ac7f71e
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8ded4b79ec7e
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8ded4b7a4872
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8ded4bef8378
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8ded4c0d9978
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8ded8aef45bd
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8dedc31f02c3
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8dee50854c35
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8defa82f1086
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8defa9066cdd
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8defac35bcc2
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9.17ba8df03daa309e
  v1/Namespace:
    - velero
  v1/PersistentVolume:
    - pvc-71956ca5-daa7-424a-8ffb-78684b7c2ab7
  v1/PersistentVolumeClaim:
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt
  v1/Pod:
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt
    - velero/node-agent-bxdzk
    - velero/node-agent-qm9jm
    - velero/velero-db67f5587-gbk6q
  v1/Secret:
    - velero/credentials-velero
    - velero/sh.helm.release.v1.velero.v1
    - velero/sh.helm.release.v1.velero.v2
    - velero/velero
    - velero/velero-repo-credentials
  v1/Service:
    - velero/velero
  v1/ServiceAccount:
    - velero/default
    - velero/velero-server
  velero.io/v1/Backup:
    - velero/cert-manager-backup-every-two-hours-20240306201024
    - velero/cert-manager-backup-every-two-hours-20240307001024
    - velero/cert-manager-backup-every-two-hours-20240307081024
    - velero/cert-manager-backup-every-two-hours-20240307101025
    - velero/cert-manager-backup-every-two-hours-20240307121025
    - velero/cert-manager-backup-every-two-hours-20240307141025
    - velero/cert-manager-backup-every-two-hours-20240307161025
    - velero/cert-manager-backup-every-two-hours-20240307181025
    - velero/flux-system-backup-every-two-hours-20240306201024
    - velero/flux-system-backup-every-two-hours-20240307001024
    - velero/flux-system-backup-every-two-hours-20240307081024
    - velero/flux-system-backup-every-two-hours-20240307101024
    - velero/flux-system-backup-every-two-hours-20240307121025
    - velero/flux-system-backup-every-two-hours-20240307141025
    - velero/flux-system-backup-every-two-hours-20240307161025
    - velero/flux-system-backup-every-two-hours-20240307181025
    - velero/minio-operator-backup-every-two-hours-20240306201024
    - velero/minio-operator-backup-every-two-hours-20240307001024
    - velero/minio-operator-backup-every-two-hours-20240307081024
    - velero/minio-operator-backup-every-two-hours-20240307101024
    - velero/minio-operator-backup-every-two-hours-20240307121025
    - velero/minio-operator-backup-every-two-hours-20240307141025
    - velero/minio-operator-backup-every-two-hours-20240307161025
    - velero/minio-operator-backup-every-two-hours-20240307181025
    - velero/nginx-example-backup-every-two-hours-20240307101024
    - velero/nginx-example-backup-every-two-hours-20240307121025
    - velero/nginx-example-backup-every-two-hours-20240307135802
    - velero/nginx-example-backup-every-two-hours-20240307141025
    - velero/nginx-example-backup-every-two-hours-20240307161025
    - velero/nginx-example-backup-every-two-hours-20240307181025
    - velero/nginx-linstor-1
    - velero/nginx-linstor-12
    - velero/nginx-linstor-13
    - velero/piraeus-datastore-backup-every-two-hours-20240306201024
    - velero/piraeus-datastore-backup-every-two-hours-20240307001024
    - velero/piraeus-datastore-backup-every-two-hours-20240307081024
    - velero/piraeus-datastore-backup-every-two-hours-20240307101025
    - velero/piraeus-datastore-backup-every-two-hours-20240307121025
    - velero/piraeus-datastore-backup-every-two-hours-20240307141025
    - velero/piraeus-datastore-backup-every-two-hours-20240307161025
    - velero/piraeus-datastore-backup-every-two-hours-20240307181025
    - velero/traefik-backup-every-two-hours-20240306201024
    - velero/traefik-backup-every-two-hours-20240307001024
    - velero/traefik-backup-every-two-hours-20240307081024
    - velero/traefik-backup-every-two-hours-20240307101025
    - velero/traefik-backup-every-two-hours-20240307121025
    - velero/traefik-backup-every-two-hours-20240307141025
    - velero/traefik-backup-every-two-hours-20240307161025
    - velero/traefik-backup-every-two-hours-20240307181025
    - velero/velero-backup-every-two-hours-20240306201024
    - velero/velero-backup-every-two-hours-20240307001024
    - velero/velero-backup-every-two-hours-20240307081024
    - velero/velero-backup-every-two-hours-20240307101025
    - velero/velero-backup-every-two-hours-20240307121025
    - velero/velero-backup-every-two-hours-20240307141025
    - velero/velero-backup-every-two-hours-20240307161025
    - velero/velero-backup-every-two-hours-20240307181025
  velero.io/v1/BackupRepository:
    - velero/nginx-example-default-kopia-29q4v
  velero.io/v1/BackupStorageLocation:
    - velero/default
  velero.io/v1/Schedule:
    - velero/cert-manager-backup-every-two-hours
    - velero/flux-system-backup-every-two-hours
    - velero/minio-operator-backup-every-two-hours
    - velero/nginx-example-backup-every-two-hours
    - velero/piraeus-datastore-backup-every-two-hours
    - velero/traefik-backup-every-two-hours
    - velero/velero-backup-every-two-hours
  velero.io/v1/VolumeSnapshotLocation:
    - velero/default
  velero.io/v2alpha1/DataUpload:
    - velero/nginx-example-backup-every-two-hours-20240307101024-5n5ft
    - velero/nginx-example-backup-every-two-hours-20240307101024-kdvqn
    - velero/nginx-example-backup-every-two-hours-20240307121025-7s77p
    - velero/nginx-example-backup-every-two-hours-20240307121025-df8jq
    - velero/nginx-example-backup-every-two-hours-20240307135802-r6h4s
    - velero/nginx-example-backup-every-two-hours-20240307135802-xhbmn
    - velero/nginx-example-backup-every-two-hours-20240307141025-87fwr
    - velero/nginx-example-backup-every-two-hours-20240307141025-j6p89
    - velero/nginx-example-backup-every-two-hours-20240307161025-j474w
    - velero/nginx-example-backup-every-two-hours-20240307161025-pnllz
    - velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt
    - velero/nginx-example-backup-every-two-hours-20240307181025-xrqg9
    - velero/nginx-linstor-1-5kxg8
    - velero/nginx-linstor-1-c94md
    - velero/nginx-linstor-12-f9j9x
    - velero/nginx-linstor-12-wl8m2
    - velero/nginx-linstor-13-2pgmb
    - velero/nginx-linstor-13-pxfj2
Backup Volumes:
  Velero-Native Snapshots: <none included>
  CSI Snapshots:
    velero/nginx-example-backup-every-two-hours-20240307181025-hhlnt:
      Snapshot:
        Operation ID: velero/velero-nginx-example-backup-every-two-hours-20240307181025w7lhg/2024-03-07T18:11:24Z
        Snapshot Content Name: snapcontent-24a68189-c19b-4a1b-968c-ceaa3e2876a1
        Storage Snapshot ID: snapshot-24a68189-c19b-4a1b-968c-ceaa3e2876a1
        Snapshot Size (bytes): 10737418240
        CSI Driver: linstor.csi.linbit.com
  Pod Volume Backups: <none included>
HooksAttempted: 0
HooksFailed: 0
It looks like a temporary snapshot object as part of a nginx-example backup process !
I don't think backing up velero namespace was ever recommended.
Tho I have not found a dedicated doc saying so. But there have been chats in the past.
In the meantime you can add --include-cluster-resources=false to this schedule to avoid said issue.
For "syncing schedules" there have been examples of using argocd, we can also reopen https://github.com/vmware-tanzu/velero/issues/2876
Why does velero-backup get this snapshot from nginx-example? It looks like a temporary snapshot object as part of a nginx-example backup process !
Velero does simply what it's being told to do, and it does not have any logic currently that "hey this is velero namespace, treat it differently."
Nothing was actually broken, this is simply code not living up to your expectation but not exactly malfunctioning.
"In the meantime you can add --include-cluster-resources=false to this schedule to avoid said issue." -- if you're backing up volume information, you probably don't want this, you probably want it set to nil (the default value), since that will pull in only relevant cluser resources, but setting it to false will pull none in -- no VSCs, no PVs, etc.
But to answer the original question: the reason VolumeSnapshotContents are not deleted is that, if you're not using the data mover and Velero deleted the VSCs after backup, it wouldn't be able to restore, since the snapshot bits would be removed. With the data mover, the snapshot contents are copied into the BackupStorageLocation, so VSCs can be deleted post-backup. Without the data mover, the VolumeSnapshotContents are not temporary data; they are required for restore to work.
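As a rough sketch of the --include-cluster-resources discussion above, the fragment below shows where the setting would sit in a Schedule template. It assumes the CLI flag corresponds to the includeClusterResources field of the backup spec; the surrounding values are copied from the velero-backup-every-two-hours schedule in this thread.

```yaml
# Sketch only: a partial Schedule spec, not a complete manifest.
spec:
  template:
    includedNamespaces:
      - velero
    # Field omitted (nil, the default): only cluster-scoped resources related
    # to the included namespaced items (for example PVs and VSCs) are pulled in.
    # false: no cluster-scoped resources are pulled in at all.
    includeClusterResources: false
```

Setting it to false is only appropriate when the backup is not meant to carry any volume information, which matches the caveat above.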
"In the meantime you can add --include-cluster-resources=false to this schedule to avoid said issue." -- if you're backing up volume information, you probably don't want this, you probably want it set to nil (the default value), since that will pull in only relevant cluser resources, but setting it to false will pull none in -- no VSCs, no PVs, etc.
@sseago OP wants to back up the velero namespace. The only volume in that namespace is the temporary data mover PVC, which they don't care about. They probably only want velero.io resources.
@kaovilai @sseago the last comment describes the situation exactly: we just want to back up all configuration in the velero namespace, such as schedules, backup locations, etc., and of course avoid errors like this.
You probably want a specific short list of included resources, then, excluding everything else. You'd probably want BackupStorageLocations, Secrets, and Schedules. I don't think you'd want Backups/Restores/etc. since those aren't really useful without the related BSL resources, and once you add a BSL, any backups in that BSL are synced to the cluster for you.
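A minimal sketch of what such a short list could look like, based on the velero-backup-every-two-hours schedule posted earlier in this thread. The resource names backupstoragelocations.velero.io and schedules.velero.io are taken from the CRD list in the describe output above; the exact selection is an assumption and should be adjusted to what actually needs to be restored.

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: velero-backup-every-two-hours
  namespace: velero
spec:
  schedule: '10 0,8-20/2 * * 1-6'
  template:
    includedNamespaces:
      - velero
    # Explicit short list instead of '*', so that temporary data mover
    # objects (PVCs, VolumeSnapshots, pods) in the namespace are skipped.
    includedResources:
      - backupstoragelocations.velero.io
      - schedules.velero.io
      - secrets
    storageLocation: default
    ttl: 168h0m0s
```

With only these resource types included, the backup has no PVCs to snapshot, so it should not pick up the temporary snapshot objects created by the nginx-example data movement.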
I think based on the latest comment by @sseago a solution has been provided.
Closing this issue.
@reasonerjt I am personally facing the same issue with the latest version of Velero (v1.13.2). It seems to happen sporadically with different volumes (sometimes our daily backup goes well, but then fails the day after). So I'm not quite sure why this issue has been closed, as I don't see any solution in the comments (except excluding the volumes, which obviously is not a good solution).
Is it possible that it has something to do with Argo CD autosync?
What steps did you take and what happened:
I have a Velero schedule that creates a backup of the running application at regular intervals, including the creation of CSI snapshots. The VolumeSnapshotClass used has the deletionPolicy set to Delete. After the backup, I see that there are no VolumeSnapshots, but there are still VolumeSnapshotContents.
What did you expect to happen: VolumeSnapshotContents are also deleted.
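For reference, a VolumeSnapshotClass with the deletion policy described in the steps above would look roughly like this. It is a sketch rather than the reporter's actual manifest: the class name and driver are the ones that appear in the backup output in this thread, and any driver-specific parameters are omitted.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: linstor
driver: linstor.csi.linbit.com
# With Delete, removing the VolumeSnapshotContent also removes the
# snapshot in the storage backend.
deletionPolicy: Delete
```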
The following information will help us better understand what's going on:
bundle-2024-03-08-13-10-57.tar.gz
Anything else you would like to add:
I see errors in output of the snapshot-controller
Additional command line output
Environment:
Velero version (use velero version):
Velero features (use velero client config get features):
Kubernetes version (use kubectl version):
OS (e.g. from /etc/os-release): Ubuntu 22.04.4 LTS