openshift / openshift-velero-plugin

General Velero plugin for backup and restore of openshift workloads.
Apache License 2.0

Fail To Restore ImageStreamTags: Error retrieving cluster version of imagestreamtags #204

Closed ihcsim closed 5 months ago

ihcsim commented 9 months ago

While attempting to restore imagestreamtags using OADP 1.1.3 on OCP 4.13.9, my restore fails with the following error: Error retrieving cluster version of imagestreamtags.image.openshift.io.

This is what the relevant log lines look like:

time="2023-09-15T17:51:09Z" level=info msg="Executing item action for imagestreamtags.image.openshift.io" logSource="/remote-source/velero/app/pkg/restore/restore.go:1153" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=info msg="[common-restore] Entering common restore plugin" cmd=/plugins/velero-plugins logSource="/remote-source/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/common/restore.go:49" pluginName=velero-plugins restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=info msg="[common-restore] common restore plugin for postgresql:13-el9" cmd=/plugins/velero-plugins logSource="/remote-source/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/common/restore.go:56" pluginName=velero-plugins restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=info msg="Executing item action for imagestreamtags.image.openshift.io" logSource="/remote-source/velero/app/pkg/restore/restore.go:1153" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=info msg="[istag-restore] Entering ImageStreamTag restore plugin" cmd=/plugins/velero-plugins logSource="/remote-source/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/imagestreamtag/restore.go:30" pluginName=velero-plugins restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=info msg="[istag-restore] Restoring imagestreamtag postgresql:13-el9" cmd=/plugins/velero-plugins logSource="/remote-source/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/imagestreamtag/restore.go:39" pluginName=velero-plugins restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=info msg="[istag-restore] backup internal registry: \"\"" cmd=/plugins/velero-plugins logSource="/remote-source/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/imagestreamtag/restore.go:42" pluginName=velero-plugins restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=info msg="[istag-restore] Reference tag: DockerImage, tag: registry.redhat.io/rhel9/postgresql-13:latest" cmd=/plugins/velero-plugins logSource="/remote-source/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/imagestreamtag/restore.go:64" pluginName=velero-plugins restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=info msg="[istag-restore] Restoring reference or remote imagestreamtag" cmd=/plugins/velero-plugins logSource="/remote-source/src/github.com/konveyor/openshift-velero-plugin/velero-plugins/imagestreamtag/restore.go:92" pluginName=velero-plugins restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=info msg="Attempting to restore ImageStreamTag: postgresql:13-el9" logSource="/remote-source/velero/app/pkg/restore/restore.go:1257" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=error msg="Error retrieving cluster version of isim-dev-restored/postgresql:13-el9: imagestreamtags.image.openshift.io \"postgresql:13-el9\" not found" logSource="/remote-source/velero/app/pkg/restore/restore.go:1285" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:09Z" level=info msg="Restored 21 items out of an estimated total of 38 (estimate will change throughout the restore)" logSource="/remote-source/velero/app/pkg/restore/restore.go:681" name="postgresql:13-el9" namespace=isim-dev-restored progress= resource=imagestreamtags.image.openshift.io restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored

Although the Restore was completed with the PartiallyFailed phase:

apiVersion: velero.io/v1
kind: Restore
metadata:
  creationTimestamp: "2023-09-15T17:50:20Z"
  generation: 4
  name: 1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
  namespace: velero-ppdm
  resourceVersion: "17131252"
  uid: a1adb8ba-c3c8-4318-840d-39b58c34ee4e
spec:
  backupName: isim-dev-2023-09-15-10-49-25-isim-dev
  excludedResources:
  - persistentvolumeclaims
  - persistentvolumes
  - services
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  - csinodes.storage.k8s.io
  - volumeattachments.storage.k8s.io
  hooks: {}
  includeClusterResources: false
  namespaceMapping:
    isim-dev: isim-dev-restored
  restorePVs: false
status:
  completionTimestamp: "2023-09-15T17:51:11Z"
  errors: 10
  phase: PartiallyFailed
  progress:
    itemsRestored: 38
    totalItems: 38
  startTimestamp: "2023-09-15T17:51:09Z"
  warnings: 6

I could see that the image streams and image stream tags were restored to the target namespace:

$ oc -n isim-dev-restored get is,istag
NAME                                        IMAGE REPOSITORY   TAGS                                            UPDATED
imagestream.image.openshift.io/postgresql                      10,10-el7,10-el8,12,12-el7,12-el8 + 4 more...   3 minutes ago

NAME                                                  IMAGE REFERENCE                                                                                                        UPDATED
imagestreamtag.image.openshift.io/postgresql:13-el9   registry.redhat.io/rhel9/postgresql-13@sha256:cb66b03ce8f01094dabe250c7b117e075e7e19a85b879721a4ab84b29c149d23         3 minutes ago
imagestreamtag.image.openshift.io/postgresql:latest   registry.redhat.io/rhel8/postgresql-13@sha256:12b1d5a86864d21d6594384edfe5cccc94205dbf689fc78459797611037060a5         3 minutes ago
imagestreamtag.image.openshift.io/postgresql:10-el7   registry.redhat.io/rhscl/postgresql-10-rhel7@sha256:9e1c9c22d84a95622edb84a8eb870267d126416ec03518e20894759f68bc9dae   3 minutes ago
imagestreamtag.image.openshift.io/postgresql:10-el8   registry.redhat.io/rhel8/postgresql-10@sha256:d8cb073b1468188422711f877462777ca2c99779af2e5f693b257b2ecdc946c9         3 minutes ago
imagestreamtag.image.openshift.io/postgresql:12       registry.redhat.io/rhscl/postgresql-12-rhel7@sha256:1fb56a2e5d37f77d932d010196367610519fde574761165b2deb82205c01d218   3 minutes ago
imagestreamtag.image.openshift.io/postgresql:12-el7   registry.redhat.io/rhscl/postgresql-12-rhel7@sha256:1fb56a2e5d37f77d932d010196367610519fde574761165b2deb82205c01d218   3 minutes ago
imagestreamtag.image.openshift.io/postgresql:12-el8   registry.redhat.io/rhel8/postgresql-12@sha256:00760eb8028eb38b66aa6defbd68e2a008dd0422badc59c989bfc4a587210751         3 minutes ago
imagestreamtag.image.openshift.io/postgresql:13-el8   registry.redhat.io/rhel8/postgresql-13@sha256:12b1d5a86864d21d6594384edfe5cccc94205dbf689fc78459797611037060a5         3 minutes ago
imagestreamtag.image.openshift.io/postgresql:10       registry.redhat.io/rhscl/postgresql-10-rhel7@sha256:9e1c9c22d84a95622edb84a8eb870267d126416ec03518e20894759f68bc9dae   3 minutes ago
imagestreamtag.image.openshift.io/postgresql:13-el7   registry.redhat.io/rhscl/postgresql-13-rhel7@sha256:e6f327f379a4846e4bbb8fda2f7cb575f5620d2991311677702b76829154db56   3 minutes ago

I am just wondering why Velero would report the imagestreamtags as not found here, when it does look like the imagestreamtags are restored.
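To make that discrepancy concrete, the listing above can be cross-checked against the tag names the restore log reports as not found. This is only a minimal sketch: it assumes the text of the `oc get is,istag` output has been captured into a string, and `listed_istags` and `present_despite_error` are hypothetical helper names, not part of Velero or the plugin.

```python
# Prefix that `oc get istag` puts in front of each name in the NAME column.
ISTAG_PREFIX = "imagestreamtag.image.openshift.io/"


def listed_istags(listing):
    """Collect ImageStreamTag names from the NAME column of the listing text."""
    names = set()
    for line in listing.splitlines():
        columns = line.split()
        # Skip headers, blank lines and the imagestream rows.
        if columns and columns[0].startswith(ISTAG_PREFIX):
            names.add(columns[0][len(ISTAG_PREFIX):])
    return names


def present_despite_error(listing, reported_missing):
    """Return tags the restore log called 'not found' that are in the listing."""
    return sorted(listed_istags(listing) & set(reported_missing))
```

Any name returned by `present_despite_error` is a tag that exists in the target namespace even though the restore counted it as an error.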

The error occurs for every tag of the image stream:

time="2023-09-15T17:51:10Z" level=error msg="Namespace isim-dev-restored, resource restore error: imagestreamtags.image.openshift.io \"postgresql:10-el7\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:510" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:10Z" level=error msg="Namespace isim-dev-restored, resource restore error: imagestreamtags.image.openshift.io \"postgresql:10-el8\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:510" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:10Z" level=error msg="Namespace isim-dev-restored, resource restore error: imagestreamtags.image.openshift.io \"postgresql:10\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:510" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:10Z" level=error msg="Namespace isim-dev-restored, resource restore error: imagestreamtags.image.openshift.io \"postgresql:12-el7\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:510" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:10Z" level=error msg="Namespace isim-dev-restored, resource restore error: imagestreamtags.image.openshift.io \"postgresql:12-el8\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:510" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:10Z" level=error msg="Namespace isim-dev-restored, resource restore error: imagestreamtags.image.openshift.io \"postgresql:12\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:510" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:10Z" level=error msg="Namespace isim-dev-restored, resource restore error: imagestreamtags.image.openshift.io \"postgresql:13-el7\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:510" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:10Z" level=error msg="Namespace isim-dev-restored, resource restore error: imagestreamtags.image.openshift.io \"postgresql:13-el8\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:510" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:10Z" level=error msg="Namespace isim-dev-restored, resource restore error: imagestreamtags.image.openshift.io \"postgresql:13-el9\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:510" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
time="2023-09-15T17:51:10Z" level=error msg="Namespace isim-dev-restored, resource restore error: imagestreamtags.image.openshift.io \"postgresql:latest\" not found" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:510" restore=velero/1a885c89-d880-5472-ac25-04e823668177-2023-09-15-10-51-25-isim-dev-restored
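
For anyone triaging a longer Velero log, the affected tag names can be pulled out of error lines like these with a small script. This is a rough sketch, not anything shipped with Velero: `parse_logfmt` and `missing_istags` are hypothetical names, and it assumes the `key=value` layout with optional double-quoted values seen in the lines above. It leans on `shlex`, so a message containing a stray apostrophe would need a more careful parser.

```python
import re
import shlex

# Matches the "not found" message shown in the log lines above.
NOT_FOUND = re.compile(r'imagestreamtags\.image\.openshift\.io "([^"]+)" not found')


def parse_logfmt(line):
    """Split one key=value log line into a dict; values may be double-quoted."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value pairs
            fields[key] = value
    return fields


def missing_istags(lines):
    """Return the sorted, de-duplicated tag names reported as not found."""
    names = set()
    for line in lines:
        if not line.strip():
            continue
        fields = parse_logfmt(line)
        if fields.get("level") != "error":
            continue
        match = NOT_FOUND.search(fields.get("msg", ""))
        if match:
            names.add(match.group(1))
    return sorted(names)
```

The result is a plain list of `name:tag` strings, which can then be compared with what is actually present in the restored namespace.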

FWIW, the pgsql imagestream used for testing is one from the out-of-box openshift namespace, without any customization in the spec. I also tried with different images, and all failed with the same error.

ImageStream YAML:

```yaml
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"image.openshift.io/v1","kind":"ImageStream","metadata":{"annotations":{"openshift.io/display-name":"PostgreSQL","openshift.io/image.dockerRepositoryCheck":"2023-08-25T22:55:02Z","samples.operator.openshift.io/version":"4.13.9"},"creationTimestamp":"2023-08-25T22:54:49Z","generation":2,"labels":{"samples.operator.openshift.io/managed":"true"},"name":"postgresql","namespace":"isim-dev","resourceVersion":"24769","uid":"adb4ae18-674f-41f4-a86b-aee83194522b"},"spec":{"lookupPolicy":{"local":false},"tags":[{"annotations":{"description":"Provides a PostgreSQL 10 database on RHEL 7. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.","iconClass":"icon-postgresql","openshift.io/display-name":"PostgreSQL (Ephemeral) 10","openshift.io/provider-display-name":"Red Hat, Inc.","tags":"database,postgresql,hidden","version":"10"},"from":{"kind":"DockerImage","name":"registry.redhat.io/rhscl/postgresql-10-rhel7:latest"},"generation":2,"importPolicy":{"importMode":"Legacy"},"name":"10","referencePolicy":{"type":"Local"}},{"annotations":{"description":"Provides a PostgreSQL 10 database on RHEL 7. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.","iconClass":"icon-postgresql","openshift.io/display-name":"PostgreSQL 10 (RHEL 7)","openshift.io/provider-display-name":"Red Hat, Inc.","tags":"database,postgresql","version":"10"},"from":{"kind":"DockerImage","name":"registry.redhat.io/rhscl/postgresql-10-rhel7:latest"},"generation":2,"importPolicy":{"importMode":"Legacy"},"name":"10-el7","referencePolicy":{"type":"Local"}},{"annotations":{"description":"Provides a PostgreSQL 10 database on RHEL 8. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.","iconClass":"icon-postgresql","openshift.io/display-name":"PostgreSQL 10 (RHEL 8)","openshift.io/provider-display-name":"Red Hat, Inc.","tags":"database,postgresql","version":"10"},"from":{"kind":"DockerImage","name":"registry.redhat.io/rhel8/postgresql-10:latest"},"generation":2,"importPolicy":{"importMode":"Legacy"},"name":"10-el8","referencePolicy":{"type":"Local"}},{"annotations":{"description":"Provides a PostgreSQL 12 database on RHEL 7. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.","iconClass":"icon-postgresql","openshift.io/display-name":"PostgreSQL (Ephemeral) 12","openshift.io/provider-display-name":"Red Hat, Inc.","tags":"database,postgresql,hidden","version":"12"},"from":{"kind":"DockerImage","name":"registry.redhat.io/rhscl/postgresql-12-rhel7:latest"},"generation":2,"importPolicy":{"importMode":"Legacy"},"name":"12","referencePolicy":{"type":"Local"}},{"annotations":{"description":"Provides a PostgreSQL 12 database on RHEL 7. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.","iconClass":"icon-postgresql","openshift.io/display-name":"PostgreSQL 12 (RHEL 7)","openshift.io/provider-display-name":"Red Hat, Inc.","tags":"database,postgresql","version":"12"},"from":{"kind":"DockerImage","name":"registry.redhat.io/rhscl/postgresql-12-rhel7:latest"},"generation":2,"importPolicy":{"importMode":"Legacy"},"name":"12-el7","referencePolicy":{"type":"Local"}},{"annotations":{"description":"Provides a PostgreSQL 12 database on RHEL 8. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.","iconClass":"icon-postgresql","openshift.io/display-name":"PostgreSQL 12 (RHEL 8)","openshift.io/provider-display-name":"Red Hat, Inc.","tags":"database,postgresql","version":"12"},"from":{"kind":"DockerImage","name":"registry.redhat.io/rhel8/postgresql-12:latest"},"generation":2,"importPolicy":{"importMode":"Legacy"},"name":"12-el8","referencePolicy":{"type":"Local"}},{"annotations":{"description":"Provides a PostgreSQL 13 database on RHEL 7. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.","iconClass":"icon-postgresql","openshift.io/display-name":"PostgreSQL 13 (RHEL 7)","openshift.io/provider-display-name":"Red Hat, Inc.","tags":"database,postgresql","version":"13"},"from":{"kind":"DockerImage","name":"registry.redhat.io/rhscl/postgresql-13-rhel7:latest"},"generation":2,"importPolicy":{"importMode":"Legacy"},"name":"13-el7","referencePolicy":{"type":"Local"}},{"annotations":{"description":"Provides a PostgreSQL 13 database on RHEL 8. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.","iconClass":"icon-postgresql","openshift.io/display-name":"PostgreSQL 13 (RHEL 8)","openshift.io/provider-display-name":"Red Hat, Inc.","tags":"database,postgresql","version":"13"},"from":{"kind":"DockerImage","name":"registry.redhat.io/rhel8/postgresql-13:latest"},"generation":2,"importPolicy":{"importMode":"Legacy"},"name":"13-el8","referencePolicy":{"type":"Local"}},{"annotations":{"description":"Provides a PostgreSQL 13 database on RHEL 9. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.","iconClass":"icon-postgresql","openshift.io/display-name":"PostgreSQL 13 (RHEL 9)","openshift.io/provider-display-name":"Red Hat, Inc.","tags":"database,postgresql","version":"13"},"from":{"kind":"DockerImage","name":"registry.redhat.io/rhel9/postgresql-13:latest"},"generation":2,"importPolicy":{"importMode":"Legacy"},"name":"13-el9","referencePolicy":{"type":"Local"}},{"annotations":{"description":"Provides a PostgreSQL database on RHEL. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.\n\nWARNING: By selecting this tag, your application will automatically update to use the latest version of PostgreSQL available on OpenShift, including major version updates.","iconClass":"icon-postgresql","openshift.io/display-name":"PostgreSQL (Latest)","openshift.io/provider-display-name":"Red Hat, Inc.","tags":"database,postgresql"},"from":{"kind":"ImageStreamTag","name":"13-el8"},"generation":1,"importPolicy":{"importMode":"Legacy"},"name":"latest","referencePolicy":{"type":"Local"}}]},"status":{"dockerImageRepository":"","tags":[{"items":[{"created":"2023-08-25T22:55:02Z","dockerImageReference":"registry.redhat.io/rhscl/postgresql-10-rhel7@sha256:6b675eb604c9d1232893946585f6b8e4826c8f85f5176e1071b491c56b04f031","generation":2,"image":"sha256:6b675eb604c9d1232893946585f6b8e4826c8f85f5176e1071b491c56b04f031"}],"tag":"10"},{"items":[{"created":"2023-08-25T22:55:02Z","dockerImageReference":"registry.redhat.io/rhscl/postgresql-10-rhel7@sha256:6b675eb604c9d1232893946585f6b8e4826c8f85f5176e1071b491c56b04f031","generation":2,"image":"sha256:6b675eb604c9d1232893946585f6b8e4826c8f85f5176e1071b491c56b04f031"}],"tag":"10-el7"},{"items":[{"created":"2023-08-25T22:55:02Z","dockerImageReference":"registry.redhat.io/rhel8/postgresql-10@sha256:d8cb073b1468188422711f877462777ca2c99779af2e5f693b257b2ecdc946c9","generation":2,"image":"sha256:d8cb073b1468188422711f877462777ca2c99779af2e5f693b257b2ecdc946c9"}],"tag":"10-el8"},{"items":[{"created":"2023-08-25T22:55:02Z","dockerImageReference":"registry.redhat.io/rhscl/postgresql-12-rhel7@sha256:faf802b678b7c39c16e7c16739535dacbf917cd4cda21a900cb4e6a13ab88e1e","generation":2,"image":"sha256:faf802b678b7c39c16e7c16739535dacbf917cd4cda21a900cb4e6a13ab88e1e"}],"tag":"12"},{"items":[{"created":"2023-08-25T22:55:02Z","dockerImageReference":"registry.redhat.io/rhscl/postgresql-12-rhel7@sha256:faf802b678b7c39c16e7c16739535dacbf917cd4cda21a900cb4e6a13ab88e1e","generation":2,"image":"sha256:faf802b678b7c39c16e7c16739535dacbf917cd4cda21a900cb4e6a13ab88e1e"}],"tag":"12-el7"},{"items":[{"created":"2023-08-25T22:55:02Z","dockerImageReference":"registry.redhat.io/rhel8/postgresql-12@sha256:00760eb8028eb38b66aa6defbd68e2a008dd0422badc59c989bfc4a587210751","generation":2,"image":"sha256:00760eb8028eb38b66aa6defbd68e2a008dd0422badc59c989bfc4a587210751"}],"tag":"12-el8"},{"items":[{"created":"2023-08-25T22:55:02Z","dockerImageReference":"registry.redhat.io/rhscl/postgresql-13-rhel7@sha256:975215bebae08c10ebdc7d93b326655a3fe8acd016ce078da599c2263a9237e5","generation":2,"image":"sha256:975215bebae08c10ebdc7d93b326655a3fe8acd016ce078da599c2263a9237e5"}],"tag":"13-el7"},{"items":[{"created":"2023-08-25T22:55:02Z","dockerImageReference":"registry.redhat.io/rhel8/postgresql-13@sha256:12b1d5a86864d21d6594384edfe5cccc94205dbf689fc78459797611037060a5","generation":2,"image":"sha256:12b1d5a86864d21d6594384edfe5cccc94205dbf689fc78459797611037060a5"}],"tag":"13-el8"},{"items":[{"created":"2023-08-25T22:55:02Z","dockerImageReference":"registry.redhat.io/rhel9/postgresql-13@sha256:658b1fa4c03f3dc0bdf00d2abd9b9fc58ff1af82540111fa2b539d13539fbcbe","generation":2,"image":"sha256:658b1fa4c03f3dc0bdf00d2abd9b9fc58ff1af82540111fa2b539d13539fbcbe"}],"tag":"13-el9"},{"items":[{"created":"2023-08-25T22:55:02Z","dockerImageReference":"registry.redhat.io/rhel8/postgresql-13@sha256:12b1d5a86864d21d6594384edfe5cccc94205dbf689fc78459797611037060a5","generation":2,"image":"sha256:12b1d5a86864d21d6594384edfe5cccc94205dbf689fc78459797611037060a5"}],"tag":"latest"}]}}
    openshift.io/display-name: PostgreSQL
    openshift.io/image.dockerRepositoryCheck: "2023-09-14T17:18:57Z"
    samples.operator.openshift.io/version: 4.13.9
  creationTimestamp: "2023-09-14T17:18:35Z"
  generation: 2
  labels:
    samples.operator.openshift.io/managed: "true"
  name: postgresql
  namespace: isim-dev
  resourceVersion: "16266870"
  uid: 8e2ce89b-a6d7-44c6-a0db-7ffa7ecf6ebc
spec:
  lookupPolicy:
    local: false
  tags:
  - annotations:
      description: Provides a PostgreSQL 10 database on RHEL 7. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.
      iconClass: icon-postgresql
      openshift.io/display-name: PostgreSQL (Ephemeral) 10
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: database,postgresql,hidden
      version: "10"
    from:
      kind: DockerImage
      name: registry.redhat.io/rhscl/postgresql-10-rhel7:latest
    generation: 2
    importPolicy:
      importMode: Legacy
    name: "10"
    referencePolicy:
      type: Local
  - annotations:
      description: Provides a PostgreSQL 10 database on RHEL 7. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.
      iconClass: icon-postgresql
      openshift.io/display-name: PostgreSQL 10 (RHEL 7)
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: database,postgresql
      version: "10"
    from:
      kind: DockerImage
      name: registry.redhat.io/rhscl/postgresql-10-rhel7:latest
    generation: 2
    importPolicy:
      importMode: Legacy
    name: 10-el7
    referencePolicy:
      type: Local
  - annotations:
      description: Provides a PostgreSQL 10 database on RHEL 8. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.
      iconClass: icon-postgresql
      openshift.io/display-name: PostgreSQL 10 (RHEL 8)
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: database,postgresql
      version: "10"
    from:
      kind: DockerImage
      name: registry.redhat.io/rhel8/postgresql-10:latest
    generation: 2
    importPolicy:
      importMode: Legacy
    name: 10-el8
    referencePolicy:
      type: Local
  - annotations:
      description: Provides a PostgreSQL 12 database on RHEL 7. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.
      iconClass: icon-postgresql
      openshift.io/display-name: PostgreSQL (Ephemeral) 12
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: database,postgresql,hidden
      version: "12"
    from:
      kind: DockerImage
      name: registry.redhat.io/rhscl/postgresql-12-rhel7:latest
    generation: 2
    importPolicy:
      importMode: Legacy
    name: "12"
    referencePolicy:
      type: Local
  - annotations:
      description: Provides a PostgreSQL 12 database on RHEL 7. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.
      iconClass: icon-postgresql
      openshift.io/display-name: PostgreSQL 12 (RHEL 7)
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: database,postgresql
      version: "12"
    from:
      kind: DockerImage
      name: registry.redhat.io/rhscl/postgresql-12-rhel7:latest
    generation: 2
    importPolicy:
      importMode: Legacy
    name: 12-el7
    referencePolicy:
      type: Local
  - annotations:
      description: Provides a PostgreSQL 12 database on RHEL 8. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.
      iconClass: icon-postgresql
      openshift.io/display-name: PostgreSQL 12 (RHEL 8)
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: database,postgresql
      version: "12"
    from:
      kind: DockerImage
      name: registry.redhat.io/rhel8/postgresql-12:latest
    generation: 2
    importPolicy:
      importMode: Legacy
    name: 12-el8
    referencePolicy:
      type: Local
  - annotations:
      description: Provides a PostgreSQL 13 database on RHEL 7. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.
      iconClass: icon-postgresql
      openshift.io/display-name: PostgreSQL 13 (RHEL 7)
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: database,postgresql
      version: "13"
    from:
      kind: DockerImage
      name: registry.redhat.io/rhscl/postgresql-13-rhel7:latest
    generation: 2
    importPolicy:
      importMode: Legacy
    name: 13-el7
    referencePolicy:
      type: Local
  - annotations:
      description: Provides a PostgreSQL 13 database on RHEL 8. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.
      iconClass: icon-postgresql
      openshift.io/display-name: PostgreSQL 13 (RHEL 8)
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: database,postgresql
      version: "13"
    from:
      kind: DockerImage
      name: registry.redhat.io/rhel8/postgresql-13:latest
    generation: 2
    importPolicy:
      importMode: Legacy
    name: 13-el8
    referencePolicy:
      type: Local
  - annotations:
      description: Provides a PostgreSQL 13 database on RHEL 9. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.
      iconClass: icon-postgresql
      openshift.io/display-name: PostgreSQL 13 (RHEL 9)
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: database,postgresql
      version: "13"
    from:
      kind: DockerImage
      name: registry.redhat.io/rhel9/postgresql-13:latest
    generation: 2
    importPolicy:
      importMode: Legacy
    name: 13-el9
    referencePolicy:
      type: Local
  - annotations:
      description: |-
        Provides a PostgreSQL database on RHEL. For more information about using this database image, including OpenShift considerations, see https://github.com/sclorg/postgresql-container/blob/master/README.md.

        WARNING: By selecting this tag, your application will automatically update to use the latest version of PostgreSQL available on OpenShift, including major version updates.
      iconClass: icon-postgresql
      openshift.io/display-name: PostgreSQL (Latest)
      openshift.io/provider-display-name: Red Hat, Inc.
      tags: database,postgresql
    from:
      kind: ImageStreamTag
      name: 13-el8
    generation: 1
    importPolicy:
      importMode: Legacy
    name: latest
    referencePolicy:
      type: Local
status:
  dockerImageRepository: ""
  tags:
  - items:
    - created: "2023-09-14T17:18:57Z"
      dockerImageReference: registry.redhat.io/rhscl/postgresql-10-rhel7@sha256:9e1c9c22d84a95622edb84a8eb870267d126416ec03518e20894759f68bc9dae
      generation: 2
      image: sha256:9e1c9c22d84a95622edb84a8eb870267d126416ec03518e20894759f68bc9dae
    tag: "10"
  - items:
    - created: "2023-09-14T17:18:57Z"
      dockerImageReference: registry.redhat.io/rhscl/postgresql-10-rhel7@sha256:9e1c9c22d84a95622edb84a8eb870267d126416ec03518e20894759f68bc9dae
      generation: 2
      image: sha256:9e1c9c22d84a95622edb84a8eb870267d126416ec03518e20894759f68bc9dae
    tag: 10-el7
  - items:
    - created: "2023-09-14T17:18:57Z"
      dockerImageReference: registry.redhat.io/rhel8/postgresql-10@sha256:d8cb073b1468188422711f877462777ca2c99779af2e5f693b257b2ecdc946c9
      generation: 2
      image: sha256:d8cb073b1468188422711f877462777ca2c99779af2e5f693b257b2ecdc946c9
    tag: 10-el8
  - items:
    - created: "2023-09-14T17:18:57Z"
      dockerImageReference: registry.redhat.io/rhscl/postgresql-12-rhel7@sha256:1fb56a2e5d37f77d932d010196367610519fde574761165b2deb82205c01d218
      generation: 2
      image: sha256:1fb56a2e5d37f77d932d010196367610519fde574761165b2deb82205c01d218
    tag: "12"
  - items:
    - created: "2023-09-14T17:18:57Z"
      dockerImageReference: registry.redhat.io/rhscl/postgresql-12-rhel7@sha256:1fb56a2e5d37f77d932d010196367610519fde574761165b2deb82205c01d218
      generation: 2
      image: sha256:1fb56a2e5d37f77d932d010196367610519fde574761165b2deb82205c01d218
    tag: 12-el7
  - items:
    - created: "2023-09-14T17:18:57Z"
      dockerImageReference: registry.redhat.io/rhel8/postgresql-12@sha256:00760eb8028eb38b66aa6defbd68e2a008dd0422badc59c989bfc4a587210751
      generation: 2
      image: sha256:00760eb8028eb38b66aa6defbd68e2a008dd0422badc59c989bfc4a587210751
    tag: 12-el8
  - items:
    - created: "2023-09-14T17:18:57Z"
      dockerImageReference: registry.redhat.io/rhscl/postgresql-13-rhel7@sha256:e6f327f379a4846e4bbb8fda2f7cb575f5620d2991311677702b76829154db56
      generation: 2
      image: sha256:e6f327f379a4846e4bbb8fda2f7cb575f5620d2991311677702b76829154db56
    tag: 13-el7
  - items:
    - created: "2023-09-14T17:18:57Z"
      dockerImageReference: registry.redhat.io/rhel8/postgresql-13@sha256:12b1d5a86864d21d6594384edfe5cccc94205dbf689fc78459797611037060a5
      generation: 2
      image: sha256:12b1d5a86864d21d6594384edfe5cccc94205dbf689fc78459797611037060a5
    tag: 13-el8
  - items:
    - created: "2023-09-14T17:18:57Z"
      dockerImageReference: registry.redhat.io/rhel9/postgresql-13@sha256:cb66b03ce8f01094dabe250c7b117e075e7e19a85b879721a4ab84b29c149d23
      generation: 2
      image: sha256:cb66b03ce8f01094dabe250c7b117e075e7e19a85b879721a4ab84b29c149d23
    tag: 13-el9
  - items:
    - created: "2023-09-14T17:18:57Z"
      dockerImageReference: registry.redhat.io/rhel8/postgresql-13@sha256:12b1d5a86864d21d6594384edfe5cccc94205dbf689fc78459797611037060a5
      generation: 2
      image: sha256:12b1d5a86864d21d6594384edfe5cccc94205dbf689fc78459797611037060a5
    tag: latest
```

ihcsim commented 9 months ago

Does anyone have any thoughts on this? Let me know if there is any other info I can provide. If this isn't the right forum for this question, let me know where is a better place for it. Thanks.

kaovilai commented 9 months ago

This is the right forum. Thanks!

kaovilai commented 9 months ago

We will be attempting to reproduce this.

ihcsim commented 9 months ago

@kaovilai any luck with reproducing this?

kaovilai commented 9 months ago

We had some discussions around what could be happening, but afaict no one has reproduced it yet.

Are you able to provide us with a reproducing case? Or is this a one-off happening somewhere?

ihcsim commented 9 months ago

Other than the provided ImageStream and Restore YAML, let me know what other specific configuration you need to reproduce the issue. We are using OADP 1.1.3 and the Velero that comes with it. It's not an isolated incident; we see this issue both in our lab and in our user's environment.

kaovilai commented 9 months ago

We would like steps to reproduce from a clean new cluster, if possible.

As provided, we only know what's being restored; we don't know what's in the cluster during backup or prior to restore.

ihcsim commented 9 months ago

@kaovilai I am able to reproduce this issue on CRC 2.27 with OADP 1.1.6, by setting backupImages: false on the DataProtectionApplication resource so that images are not backed up. I posted the Velero logs in this gist (search for "resource restore error").

Is it not possible to restore image stream and its image stream tags without backing up images, even though the image stream only references a public image, like the one here?

Also, I am under the impression that backup images only work with S3. Is that true? In our case, we aren't using S3.

Version info:

➜ crc version
CRC version: 2.27.0+71615e
OpenShift version: 4.13.12
Podman version: 4.4.4

➜ k -n openshift-adp get csv
NAME                   DISPLAY         VERSION   REPLACES               PHASE
oadp-operator.v1.1.6   OADP Operator   1.1.6     oadp-operator.v1.1.5   Succeeded

DataProtectionApplication YAML:

apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
     <snipped>
  creationTimestamp: "2023-10-04T15:49:35Z"
  generation: 3
  name: velero
  namespace: openshift-adp
  resourceVersion: "415206"
  uid: e123e94a-6d1b-4e1a-ac49-efb1ef0ab1fd
spec:
  backupImages: false
  backupLocations:
  - velero:
      config:
        profile: default
        region: ***
      credential:
        key: cloud
        name: cloud-credentials
      default: true
      objectStorage:
        bucket: ***
        prefix: ***
      provider: aws
  configuration:
    restic:
      enable: false
    velero:
      defaultPlugins:
      - openshift
      - aws
status:
  conditions:
  - lastTransitionTime: "2023-10-04T15:49:35Z"
    message: Reconcile complete
    reason: Complete
    status: "True"
    type: Reconciled

Velero backup and restore YAML:

apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/source-cluster-k8s-gitversion: v1.26.7+0ef5eae
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "26"
  creationTimestamp: "2023-10-05T19:44:36Z"
  generateName: backup-
  generation: 5
  labels:
    velero.io/storage-location: velero-1
  name: backup-cnc7h
  namespace: openshift-adp
  resourceVersion: "416565"
  uid: 3aa8c6a1-3231-42ea-8531-4437d93fa445
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToRestic: false
  excludedResources: []
  hooks: {}
  includedNamespaces:
  - isim-dev
  includedResources: []
  storageLocation: velero-1
  ttl: 720h0m0s
status:
  completionTimestamp: "2023-10-05T19:44:59Z"
  expiration: "2023-11-04T19:44:37Z"
  formatVersion: 1.1.0
  phase: Completed
  progress:
    itemsBackedUp: 45
    totalItems: 45
  startTimestamp: "2023-10-05T19:44:37Z"
  version: 1
---
apiVersion: velero.io/v1
kind: Restore
metadata:
  creationTimestamp: "2023-10-05T19:47:35Z"
  generateName: restore-
  generation: 5
  name: restore-4d6bn
  namespace: openshift-adp
  resourceVersion: "417228"
  uid: 79eed167-70a1-4a0e-88dd-a571ca589a34
spec:
  backupName: backup-cnc7h
  excludedResources:
  - persistentvolumeclaims
  - persistentvolumes
  - services
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  - csinodes.storage.k8s.io
  - volumeattachments.storage.k8s.io
  includeClusterResources: false
  includedResources: []
  namespaceMapping:
    isim-dev: isim-dev-restored
  restorePVs: false
status:
  completionTimestamp: "2023-10-05T19:47:37Z"
  errors: 12
  phase: PartiallyFailed
  progress:
    itemsRestored: 45
    totalItems: 45
  startTimestamp: "2023-10-05T19:47:35Z"
  warnings: 6

sseago commented 9 months ago

@ihcsim "Is it not possible to restore image stream and its image stream tags without backing up images, even though the image stream only references a public image, like the one here?"

The image copy functionality only copies tags for which the dockerImageReference points to the internal registry. In the example linked, that's an external image -- this is backed up and restored not by copying image bits but by backing up and restoring the kubernetes ImageStreamTag resource. @kaovilai That should be unaffected by the backupimages flag, right? Since it's just a kube item restore, and we're not using the backup registry at all for it?
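The internal-versus-external distinction can be sketched as a simple prefix check on dockerImageReference. This is only an illustration, not the plugin's actual code: `isInternalReference` is a hypothetical helper, the registry address is assumed to be the default internal registry service address, and the `isim-dev/my-app` reference is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// isInternalReference reports whether a dockerImageReference points at the
// given internal registry host. Hypothetical helper for illustration; the
// plugin's real check may differ.
func isInternalReference(dockerImageReference, internalRegistry string) bool {
	if internalRegistry == "" {
		return false
	}
	return strings.HasPrefix(dockerImageReference, internalRegistry+"/")
}

func main() {
	// Assumption: the cluster uses the default internal registry service address.
	const internal = "image-registry.openshift-image-registry.svc:5000"

	refs := []string{
		// External image taken from the ImageStream in this issue.
		"registry.redhat.io/rhel9/postgresql-13@sha256:cb66b03ce8f01094dabe250c7b117e075e7e19a85b879721a4ab84b29c149d23",
		// Made-up internal image for contrast.
		internal + "/isim-dev/my-app:latest",
	}
	for _, ref := range refs {
		if isInternalReference(ref, internal) {
			fmt.Println("image copy applies:", ref)
		} else {
			fmt.Println("restored as a plain ImageStreamTag resource:", ref)
		}
	}
}
```

Under this model, every tag in the postgresql ImageStream above is external, so none of them should depend on the image copy logic.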

kaovilai commented 9 months ago

Right. Turning off backupImages stops the imagestream image copy logic, but it doesn't prevent the ImageStreamTag from being included in the backup as a kube resource.

kaovilai commented 9 months ago

I am under the impression that backup images only work with S3. Is that true?

It should work for some configurations of GCP and Azure as well; however, S3 is the most tested configuration.

Still, having backupImages false should mean the imagestream functions don't do any internal image copying. It should be the same as if the openshift velero plugin weren't added.

Can you try if you can repeat the issue without openshift plugin?

ihcsim commented 9 months ago

this is backed up and restored not by copying image bits but by backing up and restoring the kubernetes ImageStreamTag resource.

Yes, this is what I'd expect, but it's not what I am seeing.

Can you try if you can repeat the issue without openshift plugin?

As in just plain Velero? How does that help with my original issue, which involves OADP?

ihcsim commented 9 months ago

The image copy functionality only copies tags for which the dockerImageReference points to the internal registry.

FWIW, I notice that in my example image stream, the status.dockerImageRepository was updated (by something?) to point to the internal registry, even though the spec references an external image.

kaovilai commented 9 months ago

Even with OADP installing Velero, you should be able to remove openshift from the DPA's defaultPlugins.

sseago commented 9 months ago

@kaovilai without openshift plugin won't be exactly the same. In that case we'll attempt to restore kube resources for internal images as well, which may result in imagestreamtags created for images that don't actually exist.

sseago commented 9 months ago

@ihcsim status.dockerImageRepository always points to the internal registry, even for external images. The field that points to the actual image location is the dockerImageReference -- that's the one that will be external for the image in question.

kaovilai commented 9 months ago

Root cause analysis:

// Get retrieves an image that has been tagged by stream and tag. `id` is of the format <stream name>:<tag>.
func (r *REST) Get(ctx context.Context, id string, options *metav1.GetOptions) (runtime.Object, error) {
    name, tag, err := nameAndTag(id)
    if err != nil {
        return nil, err
    }

    imageStream, err := r.imageStreamRegistry.GetImageStream(ctx, name, options)
    if err != nil {
        return nil, err
    }

    image, err := r.imageFor(ctx, tag, imageStream) // <-- this is returning an IsNotFoundError
    if err != nil {
        return nil, err
    }

    return newISTag(tag, imageStream, image, false)
}

https://github.com/openshift/openshift-apiserver/blob/9573998170f3bb7ae7e946c11b7e9fc414120df4/pkg/image/apiserver/registry/imagestreamtag/rest.go#L127C1-L145C2

which calls

// imageFor retrieves the most recent image for a tag in a given imageStreem.
func (r *REST) imageFor(ctx context.Context, tag string, imageStream *imageapi.ImageStream) (*imageapi.Image, error) {
    event := internalimageutil.LatestTaggedImage(imageStream, tag)
    if event == nil || len(event.Image) == 0 {
        return nil, kapierrors.NewNotFound(imagegroup.Resource("imagestreamtags"), imageutil.JoinImageStreamTag(imageStream.Name, tag)) // <-- this is where error originated
    }

    return r.imageRegistry.GetImage(ctx, event.Image, &metav1.GetOptions{})
}

https://github.com/openshift/openshift-apiserver/blob/9573998170f3bb7ae7e946c11b7e9fc414120df4/pkg/image/apiserver/registry/imagestreamtag/rest.go#L439C1-L447C2

which is caused by the TagEvent not yet existing, e.g. because of network speed or other delays in processing the imagestreamtag creation

// LatestTaggedImage returns the most recent TagEvent for the specified image
// repository and tag. Will resolve lookups for the empty tag. Returns nil
// if tag isn't present in stream.status.tags.

kaovilai commented 9 months ago

Should be resolved by https://github.com/vmware-tanzu/velero/pull/6949

ihcsim commented 8 months ago

@kaovilai The changes in your PR fix the issue. For some larger images with multiple tags (such as the OpenShift Postgres sample), we had to bump the retry cap to between 3 and 5 minutes, depending on the cluster. I don't know if we want to make this a configurable parameter, or if we can just pick a higher cap like 5 mins and then document this issue in the README. LMKWYT.

kaovilai commented 8 months ago

Ack. Depending on whether Velero is receptive, we might still have to make a hack in the openshift plugin.

sseago commented 8 months ago

@kaovilai If we do it in the plugin, I guess we won't need to worry about retries or waiting. If the "dry run" create fails with AlreadyExists, we just return, discarding the istag.

openshift-bot commented 5 months ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kaovilai commented 5 months ago

Closed via https://github.com/vmware-tanzu/velero/pull/7004