openshift / openshift-velero-plugin

General Velero plugin for backup and restore of OpenShift workloads.
Apache License 2.0

OADP-1004: Pods with a restic backup are excluded from being restored when they have no backup annotation #96

Closed: bo0ts closed this issue 1 year ago

bo0ts commented 3 years ago

I've reported https://github.com/openshift/oadp-operator/issues/158, but it seems that this is an issue with the openshift-velero-plugin. Feel free to ask for additional information if the linked issue isn't sufficient.

sseago commented 3 years ago

@bo0ts Yes, this is an issue in the plugin. Pods with owner references are normally excluded from the restore (since the Deployment, etc. will recreate them anyway), but we make an exception for pods with restic annotations -- at the time this plugin code was developed, restic annotations were the only way to trigger restic backups. The "default to restic" option is newer, and the plugin was never updated to account for it. We need to also make an exception for all pods with volumes when the "default to restic" option is enabled.
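For illustration, here is a minimal, hypothetical Go sketch of that exception logic. The function name and shape are not the actual plugin code; the annotation key is Velero's opt-in restic annotation, and everything else is an assumption for the sake of the example.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Velero's opt-in restic annotation (a comma-separated list of volume names).
const resticBackupAnnotation = "backup.velero.io/backup-volumes"

// shouldRestorePod sketches the exception logic described above: pods owned
// by a controller are normally skipped on restore, unless restic needs the
// pod object so its PodVolumeRestore can run. The defaultVolumesToRestic
// branch is the case this issue reports as missing.
func shouldRestorePod(pod *corev1.Pod, defaultVolumesToRestic bool) bool {
	if len(pod.OwnerReferences) == 0 {
		return true // standalone pods are always restored
	}
	if _, ok := pod.Annotations[resticBackupAnnotation]; ok {
		return true // opt-in annotation: keep the pod so restic can restore its volumes
	}
	if defaultVolumesToRestic && len(pod.Spec.Volumes) > 0 {
		return true // any pod with volumes may carry a restic backup
	}
	return false // otherwise let the owning Deployment/ReplicaSet recreate the pod
}

func main() {
	// Hypothetical owned pod with one volume, for demonstration only.
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			OwnerReferences: []metav1.OwnerReference{{Kind: "ReplicaSet", Name: "example"}},
		},
		Spec: corev1.PodSpec{
			Volumes: []corev1.Volume{{Name: "data"}},
		},
	}
	fmt.Println(shouldRestorePod(pod, false)) // false: pod skipped, its volume data is lost
	fmt.Println(shouldRestorePod(pod, true))  // true: pod restored, PodVolumeRestore can run
}
```

Without that last branch, owned pods backed up via the default-to-restic path are skipped on restore, which matches the symptom reported below: a PodVolumeBackup exists, but no PodVolumeRestore is ever created.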

DoGab commented 2 years ago

I have encountered the same problem. The volume was correctly backed up to the S3 bucket using restic and a PodVolumeBackup resource was created, but after the restore no PodVolumeRestore resource was created and the volume was not restored. When will this be fixed?

When the pods are annotated as in previous Velero versions (and as described here), the restore works.
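For reference, a hedged client-go sketch of that annotation workaround: adding Velero's opt-in `backup.velero.io/backup-volumes` annotation to a Deployment's pod template. The namespace, deployment name, and volume name are placeholders, and the kubeconfig handling is only one possible setup.

```go
package main

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a clientset from the local kubeconfig (default ~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	ctx := context.Background()

	// Placeholder namespace and deployment name.
	deploy, err := clientset.AppsV1().Deployments("my-app").Get(ctx, "my-deployment", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// Velero's opt-in annotation: a comma-separated list of volume names to
	// back up with restic. Setting it on the pod template makes the restore
	// path work even without defaultVolumesToRestic.
	if deploy.Spec.Template.Annotations == nil {
		deploy.Spec.Template.Annotations = map[string]string{}
	}
	deploy.Spec.Template.Annotations["backup.velero.io/backup-volumes"] = "my-volume"

	if _, err := clientset.AppsV1().Deployments("my-app").Update(ctx, deploy, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
}
```

Note that changing the pod template triggers a rollout of the Deployment, so this is usually done ahead of the backup rather than as part of it.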

madchr1st commented 2 years ago

I am hitting the same issue.

When using defaultVolumesToRestic in the backup CR, the volumes are backed up correctly with restic, but the volumes are not restored afterwards (only recreated, empty).
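For readers unfamiliar with the setting, `defaultVolumesToRestic` is a field on the Backup spec. Below is a minimal sketch of a Backup built with the Velero v1 Go API as it existed at the time (newer Velero releases rename this setting); the backup name and namespaces are placeholders.

```go
package main

import (
	"fmt"

	velerov1 "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	defaultToRestic := true

	// Equivalent to `defaultVolumesToRestic: true` in the Backup CR: every
	// pod volume (with a few built-in exceptions such as hostPath) is backed
	// up with restic, without per-pod annotations.
	backup := velerov1.Backup{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "example-backup", // placeholder name
			Namespace: "openshift-adp",  // namespace where Velero/OADP runs
		},
		Spec: velerov1.BackupSpec{
			IncludedNamespaces:     []string{"my-app"}, // placeholder app namespace
			DefaultVolumesToRestic: &defaultToRestic,
		},
	}

	fmt.Printf("backup %s: defaultVolumesToRestic=%v\n",
		backup.Name, *backup.Spec.DefaultVolumesToRestic)
}
```

Applied to the cluster, this is the same configuration the comment above describes: backups succeed, but the corresponding restores do not bring the volume data back.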

Since the restore only works when using pod annotations, this is IMHO not suitable for production at the moment.

Is there any way we can support you in resolving this issue?

openshift-bot commented 2 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 2 years ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten /remove-lifecycle stale

DoGab commented 2 years ago

/remove-lifecycle rotten

DoGab commented 2 years ago

When is this going to be fixed? This is still an issue with the "stable" version of OADP.

kaovilai commented 2 years ago

> volumes are not restored afterwards (only recreated, empty).

I wonder if the pod that was restored with data got killed in favor of a different pod created by its owners (Deployment, etc.).

louise-zhang commented 2 years ago

Hi there, is there any update on this issue? I am experiencing the same issue as @DoGab https://github.com/openshift/openshift-velero-plugin/issues/96#issuecomment-927847343

kaovilai commented 2 years ago

For those who do not require restic, try using CSI snapshots instead, which are working now.

There is an example in the OADP blog.

louise-zhang commented 2 years ago

I think this issue has been resolved with the latest OADP operator, oadp-operator.v1.0.2. I have tested backing up and restoring a volume without adding pod annotations, and both the backup and the restore were successful. (It did not work with oadp-operator.v1.0.1.)

VphDreamer commented 2 years ago

Hello, I am having the same issue using oadp-operator v1.0.3.

openshift-bot commented 1 year ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 1 year ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten /remove-lifecycle stale

kaovilai commented 1 year ago

/remove-lifecycle rotten /lifecycle frozen

kaovilai commented 1 year ago

Can you all confirm whether the problem still exists in 1.1?

kaovilai commented 1 year ago

IIUC this was fixed in https://github.com/openshift/openshift-velero-plugin/pull/102 and is a duplicate of https://github.com/openshift/oadp-operator/issues/74. Closing.