Naveen-Kamagani opened 2 weeks ago
This can happen if you trigger a backup and the Velero server pod is restarted while the backup is in progress; in that scenario the backup is marked as Failed.
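One way to confirm this is the cause is to look at `status.phase` together with `status.failureReason` on the Backup object. A minimal sketch of that check in Python — the helper name and the plain-dict shape are my own for illustration; they mirror the manifest fields below, not any Velero client API:

```python
def failed_on_server_restart(backup: dict) -> bool:
    """Return True if this Backup was marked Failed because the Velero
    server found it InProgress at startup (i.e. the server pod restarted
    while the backup was running)."""
    status = backup.get("status") or {}
    return (status.get("phase") == "Failed"
            and "during the server starting" in status.get("failureReason", ""))

# Status fields as they appear on the Backup in this issue:
backup = {"status": {
    "phase": "Failed",
    "failureReason": 'found a backup with status "InProgress" during the '
                     'server starting, mark it as "Failed"'}}
print(failed_on_server_restart(backup))  # True
```

If this returns True for your backup, the failure is the restart-detection path, not a problem with the backup contents themselves.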
I uninstalled and reinstalled the OADP operator and ran the backup manually:
```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: >
      {"apiVersion":"velero.io/v1","kind":"Backup","metadata":{"annotations":{},"name":"bcdr-scheduler-ondemand-vol-3","namespace":"oadp-velero"},"spec":{"csiSnapshotTimeout":"10m0s","defaultVolumesToFsBackup":false,"excludedNamespaces":["openshift","openshift-apiserver","openshift-apiserver-operator","openshift-authentication","openshift-authentication-operator","openshift-cloud-credential-operator","openshift-cluster-machine-approver","openshift-cluster-node-tuning-operator","openshift-cluster-samples-operator","openshift-cluster-storage-operator","openshift-cluster-version","openshift-config","openshift-config-managed","openshift-console","openshift-console-operator","openshift-controller-manager","openshift-controller-manager-operator","openshift-dns","openshift-dns-operator","openshift-etcd","openshift-image-registry","openshift-infra","openshift-ingress","openshift-ingress-operator","openshift-insights","openshift-kni-infra","openshift-kube-apiserver","openshift-kube-apiserver-operator","openshift-kube-controller-manager","openshift-kube-controller-manager-operator","openshift-kube-proxy","openshift-kube-scheduler","openshift-kube-scheduler-operator","openshift-machine-api","openshift-machine-config-operator","openshift-monitoring","openshift-multus","openshift-network-operator","openshift-node","openshift-openstack-infra","openshift-operator-lifecycle-manager","openshift-ovirt-infra","openshift-service-ca","openshift-service-ca-operator","openshift-service-catalog-apiserver-operator","openshift-service-catalog-controller-manager-operator","openshift-user-workload-monitoring","velero"],"excludedResources":["storageclasses.storage.k8s.io","imagestreams.image.openshift.io"],"hooks":{},"includeClusterResources":true,"includedNamespaces":["*"],"includedResources":["*"],"snapshotVolumes":true,"storageLocation":"bcdr-s3-location","ttl":"504h0m0s","volumeSnapshotLocations":["bcdr-volumesnapshot-location"]}}
    velero.io/resource-timeout: 10m0s
    velero.io/source-cluster-k8s-gitversion: v1.26.15+4818370
    velero.io/source-cluster-k8s-major-version: '1'
    velero.io/source-cluster-k8s-minor-version: '26'
  resourceVersion: '825278564'
  name: bcdr-scheduler-ondemand-vol-3
  uid: 30526dba-4135-4f54-98a5-13004c567634
  creationTimestamp: '2024-06-12T13:50:33Z'
  generation: 3
  managedFields:
    - apiVersion: velero.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            .: {}
            'f:kubectl.kubernetes.io/last-applied-configuration': {}
        'f:spec':
          'f:snapshotVolumes': {}
          'f:excludedResources': {}
          'f:includedNamespaces': {}
          'f:volumeSnapshotLocations': {}
          .: {}
          'f:includedResources': {}
          'f:defaultVolumesToFsBackup': {}
          'f:excludedNamespaces': {}
          'f:ttl': {}
          'f:csiSnapshotTimeout': {}
          'f:storageLocation': {}
          'f:hooks': {}
          'f:includeClusterResources': {}
      manager: kubectl-client-side-apply
      operation: Update
      time: '2024-06-12T13:50:33Z'
    - apiVersion: velero.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            'f:velero.io/resource-timeout': {}
            'f:velero.io/source-cluster-k8s-gitversion': {}
            'f:velero.io/source-cluster-k8s-major-version': {}
            'f:velero.io/source-cluster-k8s-minor-version': {}
          'f:labels':
            .: {}
            'f:velero.io/storage-location': {}
        'f:spec':
          'f:itemOperationTimeout': {}
          'f:snapshotMoveData': {}
        'f:status':
          .: {}
          'f:completionTimestamp': {}
          'f:expiration': {}
          'f:failureReason': {}
          'f:formatVersion': {}
          'f:phase': {}
          'f:startTimestamp': {}
          'f:version': {}
      manager: velero-server
      operation: Update
      time: '2024-06-12T13:53:13Z'
  namespace: oadp-velero
  labels:
    velero.io/storage-location: bcdr-s3-location
spec:
  volumeSnapshotLocations:
    - bcdr-volumesnapshot-location
  defaultVolumesToFsBackup: false
  excludedNamespaces:
    - openshift
    - openshift-apiserver
    - openshift-apiserver-operator
    - openshift-authentication
    - openshift-authentication-operator
    - openshift-cloud-credential-operator
    - openshift-cluster-machine-approver
    - openshift-cluster-node-tuning-operator
    - openshift-cluster-samples-operator
    - openshift-cluster-storage-operator
    - openshift-cluster-version
    - openshift-config
    - openshift-config-managed
    - openshift-console
    - openshift-console-operator
    - openshift-controller-manager
    - openshift-controller-manager-operator
    - openshift-dns
    - openshift-dns-operator
    - openshift-etcd
    - openshift-image-registry
    - openshift-infra
    - openshift-ingress
    - openshift-ingress-operator
    - openshift-insights
    - openshift-kni-infra
    - openshift-kube-apiserver
    - openshift-kube-apiserver-operator
    - openshift-kube-controller-manager
    - openshift-kube-controller-manager-operator
    - openshift-kube-proxy
    - openshift-kube-scheduler
    - openshift-kube-scheduler-operator
    - openshift-machine-api
    - openshift-machine-config-operator
    - openshift-monitoring
    - openshift-multus
    - openshift-network-operator
    - openshift-node
    - openshift-openstack-infra
    - openshift-operator-lifecycle-manager
    - openshift-ovirt-infra
    - openshift-service-ca
    - openshift-service-ca-operator
    - openshift-service-catalog-apiserver-operator
    - openshift-service-catalog-controller-manager-operator
    - openshift-user-workload-monitoring
    - velero
  csiSnapshotTimeout: 10m0s
  includedResources:
    - '*'
  ttl: 504h0m0s
  itemOperationTimeout: 4h0m0s
  storageLocation: bcdr-s3-location
  hooks: {}
  includeClusterResources: true
  includedNamespaces:
    - '*'
  snapshotVolumes: true
  excludedResources:
    - storageclasses.storage.k8s.io
    - imagestreams.image.openshift.io
  snapshotMoveData: false
status:
  completionTimestamp: '2024-06-12T13:53:13Z'
  expiration: '2024-07-03T13:50:33Z'
  failureReason: >-
    found a backup with status "InProgress" during the server starting, mark it
    as "Failed"
  formatVersion: 1.1.0
  phase: Failed
  startTimestamp: '2024-06-12T13:50:40Z'
  version: 1
```
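As a side note, the `status.expiration` on this Failed backup is simply `metadata.creationTimestamp` plus `spec.ttl` (504h = 21 days), so the timestamps in the manifest are internally consistent. A quick arithmetic check, using the values copied from the manifest above:

```python
from datetime import datetime, timedelta

# Values taken from the Backup manifest in this issue.
created = datetime.fromisoformat("2024-06-12T13:50:33+00:00")  # creationTimestamp
ttl = timedelta(hours=504)  # spec.ttl: 504h0m0s, i.e. 21 days

expiration = created + ttl
print(expiration.isoformat())  # 2024-07-03T13:50:33+00:00, matches status.expiration
```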
There are no other backups being triggered, but this backup still goes into the Failed state.
@shubham-pampattiwar I made sure no other backups were triggered while this backup was running, but I still get the same issue.
Can someone help me with this issue?
OADP Operator version is v1.3.2

velero log:

```
time="2024-06-13T12:03:42Z" level=info msg="All Velero custom resource definitions exist" logSource="/remote-source/velero/app/pkg/cmd/server/server.go:513"
time="2024-06-13T12:03:42Z" level=warning msg="Velero node agent not found; pod volume backups/restores will not work until it's created" logSource="/remote-source/velero/app/pkg/cmd/server/server.go:585"
time="2024-06-13T12:03:46Z" level=warning msg="found a backup with status \"InProgress\" during the server starting, mark it as \"Failed\"" backup=bcdr-scheduler-20240613120039 logSource="/remote-source/velero/app/pkg/cmd/server/server.go:1077"
time="2024-06-13T12:03:46Z" level=info msg="Starting controllers" logSource="/remote-source/velero/app/pkg/cmd/server/server.go:640"
```

helped via velero issue
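The warning line in the server log names the backup that was marked Failed at startup (`backup=bcdr-scheduler-20240613120039`). A small sketch for pulling the affected backup names out of a velero log dump — the function name is mine, and the regex assumes the log line format shown above:

```python
import re

# One warning line copied from the velero server log above; the \" escapes
# appear literally in the log output.
LINE = ('time="2024-06-13T12:03:46Z" level=warning '
        'msg="found a backup with status \\"InProgress\\" during the server '
        'starting, mark it as \\"Failed\\"" backup=bcdr-scheduler-20240613120039 '
        'logSource="/remote-source/velero/app/pkg/cmd/server/server.go:1077"')

def backups_failed_at_startup(log_text: str) -> list[str]:
    """Return the names of backups that the Velero server found InProgress
    at startup and therefore marked Failed (one warning line per backup)."""
    pattern = re.compile(r'found a backup with status.*?backup=(\S+)')
    return pattern.findall(log_text)

print(backups_failed_at_startup(LINE))  # ['bcdr-scheduler-20240613120039']
```

Seeing this warning for a backup you did not knowingly interrupt usually means the velero pod was restarted (OOM-killed, evicted, node drained, or redeployed) while that backup was running, so checking the pod's restart count and previous-container logs is a good next step.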