openshift / openshift-velero-plugin

General Velero plugin for backup and restore of OpenShift workloads.
Apache License 2.0

Restore ImageTag failed because of invalid Spec #79

Closed phuongatemc closed 2 years ago

phuongatemc commented 3 years ago

Steps to reproduce:

sseago commented 3 years ago

We have an ImageTag plugin which abandons the ImageTag restore to get around this problem. ImageTags serve the same basic function as ImageStreamTags, but my understanding is that ImageTags will eventually supersede them. For the moment, though, creating one creates the other, so there's no reason to restore both of them. For recent MTC versions, we've disabled this plugin to get around a Velero bug: on clusters where ImageTags don't exist (OCP 4.3 and older), the ImageTag plugin ends up registered for all types, which resulted in everything being discarded, not just ImageTags. If you're using a build of openshift-velero-plugin with this plugin disabled, you can get the same effect by including imagetags in the exclude-resources list for the backup.
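The abandon-the-restore pattern described above can be sketched as follows. This is a minimal, self-contained mock, not the actual plugin source: the type and method names mirror the Velero `RestoreItemAction` interface from `github.com/vmware-tanzu/velero/pkg/plugin/velero`, but the framework types are stubbed locally so the sketch compiles on its own.

```go
package main

import "fmt"

// Local stand-ins for the Velero plugin framework types, so the
// pattern can be shown self-contained. The real plugin uses
// github.com/vmware-tanzu/velero/pkg/plugin/velero instead.
type ResourceSelector struct {
	IncludedResources []string
}

type RestoreItemActionExecuteInput struct {
	Item interface{} // the object being restored
}

type RestoreItemActionExecuteOutput struct {
	UpdatedItem interface{}
	SkipRestore bool
}

// WithoutRestore mirrors the framework method of the same name: it
// marks the item so Velero skips creating it in the cluster.
func (o *RestoreItemActionExecuteOutput) WithoutRestore() *RestoreItemActionExecuteOutput {
	o.SkipRestore = true
	return o
}

// ImageTagRestorePlugin abandons the ImageTag restore: restoring the
// ImageStreamTag recreates the ImageTag anyway, so restoring both is
// redundant and the direct create fails.
type ImageTagRestorePlugin struct{}

// AppliesTo should limit the action to imagetags; the Velero bug
// described above was that on clusters without the ImageTag resource
// (OCP 4.3 and older) the registration fell through to all types.
func (p *ImageTagRestorePlugin) AppliesTo() (ResourceSelector, error) {
	return ResourceSelector{IncludedResources: []string{"imagetags"}}, nil
}

// Execute returns the item unchanged but flagged to skip restore,
// matching the quoted return of
// velero.NewRestoreItemActionExecuteOutput(input.Item).WithoutRestore().
func (p *ImageTagRestorePlugin) Execute(input RestoreItemActionExecuteInput) (*RestoreItemActionExecuteOutput, error) {
	out := &RestoreItemActionExecuteOutput{UpdatedItem: input.Item}
	return out.WithoutRestore(), nil
}

func main() {
	p := &ImageTagRestorePlugin{}
	sel, _ := p.AppliesTo()
	out, _ := p.Execute(RestoreItemActionExecuteInput{Item: "ruby-hello-world:latest"})
	fmt.Println(sel.IncludedResources[0], "skip:", out.SkipRestore)
}
```

The key design point is that `WithoutRestore()` does not delete or modify anything; it only tells Velero not to create the item, leaving the ImageStreamTag restore to repopulate the ImageTag.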

sseago commented 3 years ago

Once we hit a point where there's actually information in the ImageTag that we want to restore that is not handled by restoring the ImageStreamTag, we may need more than just abandoning the ImageTag in the plugin, but for the moment, I haven't seen anything that we lose by just restoring the ImageStreamTag. I do need to go back and figure out why the ImageTag plugin isn't working with velero upstream -- it's also possible that with Velero 1.6 the problem has gone away.

phuongatemc commented 3 years ago

I tested the latest release Velero 1.6 with the OpenShift plugin 1.4.4 and still saw the same error:

```
oc get pod -n velero-ppdm velero-559ffd7876-7k96q -o yaml
...
```

sseago commented 3 years ago

@phuongatemc Thanks for letting me know it's still there with 1.6. We first noticed this problem with either velero 1.3 or 1.4 (I can't remember which), and there was an upstream fix that resolved the issue. I suspect that some upstream refactoring reintroduced the bug. I need to set aside some time later this week to find and fix it again upstream and submit an upstream PR.

phuongatemc commented 3 years ago

I notice in the log that two ImageTags are being restored. The first one restores without a problem; the second ImageTag fails. I also notice that it executes the "common-restore" plugin instead of "imagetag-restore". The source code of the ImageTag plugin shows it returns `velero.NewRestoreItemActionExecuteOutput(input.Item).WithoutRestore(), nil`, but it looks like this ImageTag function is not executed. Here is the log:

```
time="2021-05-04T16:46:39Z" level=info msg="Getting client for image.openshift.io/v1, Kind=ImageTag" logSource="pkg/restore/restore.go:768" restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:39Z" level=info msg="Executing item action for imagetags.image.openshift.io" logSource="pkg/restore/restore.go:1002" restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:39Z" level=info msg="[common-restore] Entering common restore plugin" cmd=/plugins/velero-plugins logSource="/go/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/common/restore.go:23" pluginName=velero-plugins restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:39Z" level=info msg="[common-restore] common restore plugin for ruby-22-centos7:latest" cmd=/plugins/velero-plugins logSource="/go/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/common/restore.go:30" pluginName=velero-plugins restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="[GetRegistryInfo] value from imagestream" cmd=/plugins/velero-plugins logSource="/go/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/common/shared.go:35" pluginName=velero-plugins restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="[common-restore] Error getting registry route, assuming this is outside of OADP context." cmd=/plugins/velero-plugins logSource="/go/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/common/restore.go:52" pluginName=velero-plugins restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="Attempting to restore ImageTag: ruby-22-centos7:latest" logSource="pkg/restore/restore.go:1107" restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="Executing item action for imagetags.image.openshift.io" logSource="pkg/restore/restore.go:1002" restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="[common-restore] Entering common restore plugin" cmd=/plugins/velero-plugins logSource="/go/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/common/restore.go:23" pluginName=velero-plugins restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="[common-restore] common restore plugin for ruby-hello-world:latest" cmd=/plugins/velero-plugins logSource="/go/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/common/restore.go:30" pluginName=velero-plugins restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="[GetRegistryInfo] value from imagestream" cmd=/plugins/velero-plugins logSource="/go/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/common/shared.go:35" pluginName=velero-plugins restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="[common-restore] Error getting registry route, assuming this is outside of OADP context." cmd=/plugins/velero-plugins logSource="/go/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/common/restore.go:52" pluginName=velero-plugins restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="Attempting to restore ImageTag: ruby-hello-world:latest" logSource="pkg/restore/restore.go:1107" restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="error restoring ruby-hello-world:latest: ImageTag.image.openshift.io \"ruby-hello-world:latest\" is invalid: spec: Required value: spec is a required field during creation" logSource="pkg/restore/restore.go:1170" restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
time="2021-05-04T16:46:40Z" level=info msg="Restoring resource 'replicationcontrollers' into namespace 'ph-hello-restore-5'" logSource="pkg/restore/restore.go:724" restore=velero-ppdm/aed4b4d2-cf98-533e-b531-de235a1b8213-2021-05-04-09-46-34-ph-hello-restore-5
```
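The create failure at the end of the log can be illustrated with a small, self-contained sketch. This is a hypothetical, simplified model of create-time validation, not the actual OpenShift API server code: the assumption it encodes is that an ImageTag captured in a backup may carry only server-populated status, and a direct create without a spec is rejected with the error seen in the log.

```go
package main

import "fmt"

// ImageTag is a hypothetical, cut-down model of the real resource:
// Spec is optional in a stored object (nil here stands in for a
// backed-up, status-only ImageTag), Status is server-populated.
type ImageTag struct {
	Name   string
	Spec   *string // tag reference; nil in a status-only object
	Status string
}

// validateCreate mimics the server's create-time check that produced
// the "spec is a required field during creation" error in the log.
func validateCreate(tag ImageTag) error {
	if tag.Spec == nil {
		return fmt.Errorf("ImageTag.image.openshift.io %q is invalid: spec: Required value: spec is a required field during creation", tag.Name)
	}
	return nil
}

func main() {
	// A backed-up ImageTag with no spec fails a direct create attempt,
	// which is why the plugin skips restoring it instead.
	backedUp := ImageTag{Name: "ruby-hello-world:latest", Status: "status-only"}
	if err := validateCreate(backedUp); err != nil {
		fmt.Println(err)
	}
}
```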

openshift-bot commented 2 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 2 years ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

/remove-lifecycle stale

openshift-bot commented 2 years ago

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci[bot] commented 2 years ago

@openshift-bot: Closing this issue.

In response to [this](https://github.com/openshift/openshift-velero-plugin/issues/79#issuecomment-968139592):

> Rotten issues close after 30d of inactivity.
>
> Reopen the issue by commenting `/reopen`.
> Mark the issue as fresh by commenting `/remove-lifecycle rotten`.
> Exclude this issue from closing again by commenting `/lifecycle frozen`.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.