Closed: dove-young closed this issue 3 weeks ago
Here is what I did:
[root@api.o1-726938]# velero backup get test-backup
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
test-backup Completed 0 0 2024-04-11 20:30:39 -0700 PDT 29d default <none>
[root@api.o1-726938]#
[root@api.o1-726938]# velero restore describe test-backup-20240411204147
Name: test-backup-20240411204147
Namespace: velero
Labels: <none>
Annotations: <none>
Phase: Failed (run 'velero restore logs test-backup-20240411204147' for more information)
Total items to be restored: 7848
Items restored: 400
Started: 2024-04-11 20:41:47 -0700 PDT
Completed: 2024-04-11 20:43:09 -0700 PDT
Backup: test-backup
Namespaces:
Included: all namespaces found in the backup
Excluded: <none>
Resources:
Included: *
Excluded: nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io, csinodes.storage.k8s.io, volumeattachments.storage.k8s.io, backuprepositories.velero.io
Cluster-scoped: auto
Namespace mappings: <none>
Label selector: <none>
Or label selector: <none>
Restore PVs: auto
Existing Resource Policy: <none>
ItemOperationTimeout: 4h0m0s
Preserve Service NodePorts: auto
[root@api.o1-726938]#
[root@api.o1-726938]# velero backup describe test-backup
Name: test-backup
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/resource-timeout=10m0s
velero.io/source-cluster-k8s-gitversion=v1.28.7+f1b5f6c
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=28
Phase: Completed
Namespaces:
Included: *
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Label selector: <none>
Or label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
Snapshot Move Data: false
Data Mover: velero
TTL: 720h0m0s
CSISnapshotTimeout: 10m0s
ItemOperationTimeout: 4h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2024-04-11 20:30:39 -0700 PDT
Completed: 2024-04-11 20:32:05 -0700 PDT
Expiration: 2024-05-11 20:30:39 -0700 PDT
Total items to be backed up: 8983
Items backed up: 8983
Backup Volumes:
Velero-Native Snapshots: <none included>
CSI Snapshots: <none included>
Pod Volume Backups: <none included>
HooksAttempted: 0
HooksFailed: 0
[root@api.o1-726938]#
I can see the files in my MinIO bucket:
[root@api.o1-726938]# mc ls st2-tunnel-dev1/veleco-hgliu/backups/test-backup/
[2024-04-11 20:32:03 PDT] 29B STANDARD test-backup-csi-volumesnapshotclasses.json.gz
[2024-04-11 20:32:03 PDT] 29B STANDARD test-backup-csi-volumesnapshotcontents.json.gz
[2024-04-11 20:32:04 PDT] 29B STANDARD test-backup-csi-volumesnapshots.json.gz
[2024-04-11 20:32:02 PDT] 27B STANDARD test-backup-itemoperations.json.gz
[2024-04-11 20:31:45 PDT] 234KiB STANDARD test-backup-logs.gz
[2024-04-11 20:32:02 PDT] 29B STANDARD test-backup-podvolumebackups.json.gz
[2024-04-11 20:32:02 PDT] 69KiB STANDARD test-backup-resource-list.json.gz
[2024-04-11 20:32:04 PDT] 49B STANDARD test-backup-results.gz
[2024-04-11 20:32:03 PDT] 1.2KiB STANDARD test-backup-volumeinfo.json.gz
[2024-04-11 20:32:02 PDT] 29B STANDARD test-backup-volumesnapshots.json.gz
[2024-04-11 20:32:01 PDT] 22MiB STANDARD test-backup.tar.gz
[2024-04-11 20:32:05 PDT] 2.9KiB STANDARD velero-backup.json
[root@api.o1-726938]#
I created a new backup and tried again; the restore failed with the same error.
Backup
[root@st2-bastion-dev21]# velero backup describe test-backup-2
Name: test-backup-2
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/resource-timeout=10m0s
velero.io/source-cluster-k8s-gitversion=v1.28.7+f1b5f6c
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=28
Phase: Completed
Namespaces:
Included: *
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Label selector: <none>
Or label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
Snapshot Move Data: false
Data Mover: velero
TTL: 720h0m0s
CSISnapshotTimeout: 10m0s
ItemOperationTimeout: 4h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2024-04-11 22:20:47 -0700 PDT
Completed: 2024-04-11 22:22:04 -0700 PDT
Expiration: 2024-05-11 22:20:47 -0700 PDT
Total items to be backed up: 9426
Items backed up: 9426
Backup Volumes:
Velero-Native Snapshots: <none included>
CSI Snapshots: <none included>
Pod Volume Backups: <none included>
HooksAttempted: 0
HooksFailed: 0
Restore
Restore completed with status: Failed. You may check for more information using the commands `velero restore describe test-backup-2-20240411222311` and `velero restore logs test-backup-2-20240411222311`.
[root@api.o1-726938]# velero restore logs test-backup-2-20240411222311
An error occurred: file not found
[root@api.o1-726938]# velero restore describe test-backup-2-20240411222311
Name: test-backup-2-20240411222311
Namespace: velero
Labels: <none>
Annotations: <none>
Phase: Failed (run 'velero restore logs test-backup-2-20240411222311' for more information)
Total items to be restored: 7849
Items restored: 241
Started: 2024-04-11 22:23:11 -0700 PDT
Completed: 2024-04-11 22:24:05 -0700 PDT
Backup: test-backup-2
Namespaces:
Included: all namespaces found in the backup
Excluded: <none>
Resources:
Included: *
Excluded: nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io, csinodes.storage.k8s.io, volumeattachments.storage.k8s.io, backuprepositories.velero.io
Cluster-scoped: auto
Namespace mappings: <none>
Label selector: <none>
Or label selector: <none>
Restore PVs: auto
Existing Resource Policy: <none>
ItemOperationTimeout: 4h0m0s
Preserve Service NodePorts: auto
[root@api.o1-726938]# mc ls st2-tunnel-dev1/veleco-hgliu/backups/test-backup-2
[2024-04-11 22:22:02 PDT] 29B STANDARD test-backup-2-csi-volumesnapshotclasses.json.gz
[2024-04-11 22:22:02 PDT] 29B STANDARD test-backup-2-csi-volumesnapshotcontents.json.gz
[2024-04-11 22:22:01 PDT] 29B STANDARD test-backup-2-csi-volumesnapshots.json.gz
[2024-04-11 22:22:01 PDT] 27B STANDARD test-backup-2-itemoperations.json.gz
[2024-04-11 22:21:48 PDT] 247KiB STANDARD test-backup-2-logs.gz
[2024-04-11 22:22:03 PDT] 29B STANDARD test-backup-2-podvolumebackups.json.gz
[2024-04-11 22:22:02 PDT] 73KiB STANDARD test-backup-2-resource-list.json.gz
[2024-04-11 22:22:02 PDT] 49B STANDARD test-backup-2-results.gz
[2024-04-11 22:22:03 PDT] 1.1KiB STANDARD test-backup-2-volumeinfo.json.gz
[2024-04-11 22:22:03 PDT] 29B STANDARD test-backup-2-volumesnapshots.json.gz
[2024-04-11 22:22:01 PDT] 22MiB STANDARD test-backup-2.tar.gz
[2024-04-11 22:22:04 PDT] 2.9KiB STANDARD velero-backup.json
[root@api.o1-726938]#
From the Velero pod log I can see many restore records, but near the end of them there are also some failures like:
Failed to watch *unstructured.Unstructured: the server does not allow this method on the requested resource
time="2024-04-12T07:08:10Z" level=info msg="Resource 'statefulsets.apps' will be restored into namespace 'instana-clickhouse'" logSource="pkg/restore/restore.go:2264" restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Resource 'storages.operator.openshift.io' will be restored at cluster scope" logSource="pkg/restore/restore.go:2266" restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Resource 'storageversionmigrations.migration.k8s.io' will be restored at cluster scope" logSource="pkg/restore/restore.go:2266" restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Resource 'strimzipodsets.core.strimzi.io' will be restored into namespace 'instana-kafka'" logSource="pkg/restore/restore.go:2264" restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Resource 'templates.template.openshift.io' will be restored into namespace 'openshift'" logSource="pkg/restore/restore.go:2264" restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Resource 'tuneds.tuned.openshift.io' will be restored into namespace 'openshift-cluster-node-tuning-operator'" logSource="pkg/restore/restore.go:2264" restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Resource 'units.instana.io' will be restored into namespace 'instana-units'" logSource="pkg/restore/restore.go:2264" restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Resource 'validatingwebhookconfigurations.admissionregistration.k8s.io' will be restored at cluster scope" logSource="pkg/restore/restore.go:2266" restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Skipping restore of resource because the restore spec excludes it" logSource="pkg/restore/restore.go:2202" resource=volumeattachments.storage.k8s.io restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Resource 'zookeeperclusters.zookeeper.pravega.io' will be restored into namespace 'instana-clickhouse'" logSource="pkg/restore/restore.go:2264" restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Skipping restore of resource because it cannot be resolved via discovery" logSource="pkg/restore/restore.go:2185" resource=clusterbootstraps.run.tanzu.vmware.com restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Skipping restore of resource because it cannot be resolved via discovery" logSource="pkg/restore/restore.go:2185" resource=clusters.cluster.x-k8s.io restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="Skipping restore of resource because it cannot be resolved via discovery" logSource="pkg/restore/restore.go:2185" resource=clusterresourcesets.addons.cluster.x-k8s.io restore=velero/test-backup-2-20240412000742
time="2024-04-12T07:08:10Z" level=info msg="waiting informer cache sync ..." logSource="pkg/restore/restore.go:604" restore=velero/test-backup-2-20240412000742
E0412 07:08:10.381344 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.6/tools/cache/reflector.go:169: Failed to watch *unstructured.Unstructured: the server does not allow this method on the requested resource
E0412 07:08:10.392096 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.6/tools/cache/reflector.go:169: Failed to watch *unstructured.Unstructured: the server does not allow this method on the requested resource
I0412 07:08:11.273994 1 request.go:690] Waited for 1.089987061s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/monitoring.coreos.com/v1/servicemonitors?limit=500&resourceVersion=0
E0412 07:08:11.289286 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.6/tools/cache/reflector.go:169: Failed to watch *unstructured.Unstructured: the server does not allow this method on the requested resource
E0412 07:08:11.897525 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.6/tools/cache/reflector.go:169: Failed to watch *unstructured.Unstructured: the server does not allow this method on the requested resource
I checked the ClusterRoleBinding; it binds to cluster-admin, which looks correct.
[root@api.o1-726938~]# oc get ClusterRoleBinding/velero
NAME ROLE AGE
velero ClusterRole/cluster-admin 52m
[root@api.o1-726938 ~]#
Looking at the bundle you uploaded, it looks like the Velero pod restarted while your restore was running, so on restart Velero marked it as failed, since Velero can't resume an in-progress backup or restore if it crashes in the middle. You'd need to rerun the restore. The reason the `velero restore logs` command failed is that restore logs are only uploaded upon completion of the restore; since the restore never finished, the logs weren't uploaded.
The next step would be to figure out why your Velero pod crashed. Does it restart often? Did the restore cause OOM issues? If so, bumping up the memory/CPU resources in your Velero deployment might help.
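To check for OOM kills, `kubectl -n velero get pods` shows the restart count, and `kubectl -n velero describe pod <velero-pod>` shows the last container state (look for `OOMKilled`). If that is the cause, a resources bump on the Velero deployment could look roughly like the sketch below; the values shown are illustrative assumptions, not recommendations, so size them to your cluster:

```yaml
# Sketch of the resources section of the velero Deployment
# (kubectl -n velero edit deployment velero). Values are examples only.
spec:
  template:
    spec:
      containers:
        - name: velero
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 2Gi    # raise if the pod is being OOMKilled mid-restore
```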
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands.
This issue was closed because it has been stalled for 14 days with no activity.
What steps did you take and what happened:
I restored a backup from one OCP cluster to another OCP cluster; the restore failed after a few moments. My storage is MinIO.
I checked `velero restore logs`, and it said: `An error occurred: file not found`
What did you expect to happen:
The restore to finish successfully.
The following information will help us better understand what's going on:
bundle-2024-04-11-20-51-32.tar.gz
If you are using velero v1.7.0+:
Please use `velero debug --backup <backupname> --restore <restorename>` to generate the support bundle and attach it to this issue; for more options, please refer to `velero debug --help`.
If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)
kubectl logs deployment/velero -n velero
velero backup describe <backupname>
or `kubectl get backup/<backupname> -n velero -o yaml`
velero backup logs <backupname>
velero restore describe <restorename>
or `kubectl get restore/<restorename> -n velero -o yaml`
velero restore logs <restorename>
Anything else you would like to add:
Environment:
Velero version (use `velero version`): v1.13.1
Velero features (use `velero client config get features`): features:
Kubernetes version (use `kubectl version`): Client Version: v1.28.2, Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3, Server Version: v1.28.7+6e2789b
OS (use `/etc/os-release`): RHEL 8
Vote on this issue!
This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.