vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0

velero-restore-helper hangs forever when rebooting cluster twice #8018

Open RobKenis opened 4 months ago

RobKenis commented 4 months ago

What steps did you take and what happened:

The filesystem restore done file /restores/repo1/.velero/5132d6c3-7290-45a2-a04a-f4f851a2642e is not found yet. Retry later.
The filesystem restore done file /restores/repo1/.velero/5132d6c3-7290-45a2-a04a-f4f851a2642e is not found yet. Retry later.

What did you expect to happen:

Either the .velero directory is not cleaned up (the cleanup was introduced in this change), or the restore-helper can handle multiple reboots.

The following information will help us better understand what's going on:

If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename> to generate the support bundle and attach it to this issue. For more options, refer to velero debug --help

bundle-2024-07-16-15-57-22.tar.gz

Anything else you would like to add:

Environment:

VERSION="9.3 (Shamrock Pampas Cat)"
ID="almalinux"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="AlmaLinux 9.3 (Shamrock Pampas Cat)"
ANSI_COLOR="0;34"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:almalinux:almalinux:9::baseos"
HOME_URL="https://almalinux.org/"
DOCUMENTATION_URL="https://wiki.almalinux.org/"
BUG_REPORT_URL="https://bugs.almalinux.org/"

ALMALINUX_MANTISBT_PROJECT="AlmaLinux-9"
ALMALINUX_MANTISBT_PROJECT_VERSION="9.3"
REDHAT_SUPPORT_PRODUCT="AlmaLinux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.3"


reasonerjt commented 4 months ago

@RobKenis Let me clarify: when you say "Reboot the new server", do you mean rebooting the api-server? Is this expected behavior?

kaovilai commented 4 months ago

Would it be possible for the restic restore helper to detect a recent reboot during init?

RobKenis commented 3 months ago

@reasonerjt By rebooting the server, I mean running reboot, so the entire OS restarts, including the api-server.