openshift / oadp-operator

OADP Operator
Apache License 2.0

Mongo DeploymentConfig Restic Imagestream flakes #925

Closed kaovilai closed 1 year ago

kaovilai commented 1 year ago
  1. Application YAML deployed

  2. BuildConfigs are triggered, build, and push the image to ImageStreams

    kind: ImageStreamTag
    name: "todolist-mongo-go:latest"
    1. image output: {&BuildStatusOutputTo{ImageDigest:sha256:5bf85296fbc9ab386a0e1cc153002db38637e23936a2593c261f72e236cbd0a7,}}
  3. DeploymentConfig redeploys using the image generated from the new build: image-registry.openshift-image-registry.svc:5000/mongo-persistent/todolist-mongo-go@sha256:5bf85296fbc9ab386a0e1cc153002db38637e23936a2593c261f72e236cbd0a7

  4. Pod is running

  5. Backup succeeds, with the imagestreamtag backed up

    1. level=info msg="[istag-backup] Backing up imagestreamtag todolist-mongo-go:latest"
  6. Destroy the namespace (everything is gone, including imagestreams)

  7. Restore completed successfully

    1. ImageStream is also restored
    2. [istag-restore] Restoring imagestreamtag todolist-mongo-go:latest
    3. [istag-restore] Local image: image-registry.openshift-image-registry.svc:5000/mongo-persistent/todolist-mongo-go@sha256:5bf85296fbc9ab386a0e1cc153002db38637e23936a2593c261f72e236cbd0a7
      1. from imagestream restore image
  8. BuildConfig that got restored triggered another build

  9. The build pushes a new todolist-mongo-go:latest image, overriding the previously restored :latest tag. However, as the next few steps show, the first restored SHAs should still be pullable.

    1. From restored build push image
  10. DeploymentConfig creates a new pod based on the ConfigChange trigger due to the ImageStreamTag update

  11. The disconnected pod will have the SHA from the build in the original backup, which should be pullable.

  12. Expected: all pods are running

  13. Actual: sometimes the disconnected pod ends up in ImagePullBackOff due to registry auth errors

  14. Running oc registry login && docker pull verified that the imagestreamtag SHA can be pulled
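
To make the digest comparison in steps 11 to 14 concrete, here is a minimal sketch that splits an image reference into its parts and checks that the digest matches the one recorded by the original build. The helper name `split_image_reference` is hypothetical and not part of OADP or openshift-velero-plugin; it assumes the `registry/namespace/name@digest` layout seen in the references above.

```python
def split_image_reference(ref):
    """Split 'registry/namespace/name@sha256:...' into its parts.

    Hypothetical helper for illustration; assumes exactly the
    registry/namespace/name@digest layout shown in this issue.
    """
    repository, sep, digest = ref.partition("@")
    if not sep:
        raise ValueError("reference has no digest: " + ref)
    registry, namespace, name = repository.split("/", 2)
    return {
        "registry": registry,
        "namespace": namespace,
        "name": name,
        "digest": digest,
    }


# Digest reported by the build in step 2.
backup_digest = (
    "sha256:5bf85296fbc9ab386a0e1cc153002db38637e23936a2593c261f72e236cbd0a7"
)

# Image reference used by the DeploymentConfig in step 3.
pod_image = (
    "image-registry.openshift-image-registry.svc:5000/"
    "mongo-persistent/todolist-mongo-go@"
    "sha256:5bf85296fbc9ab386a0e1cc153002db38637e23936a2593c261f72e236cbd0a7"
)

parts = split_image_reference(pod_image)
digest_matches = parts["digest"] == backup_digest
print(parts["namespace"], parts["name"], digest_matches)
```

A matching digest only shows that the pod points at the image from the original backup; it says nothing about whether the pod's credentials allow it to pull that image, which is where the failure in step 13 occurs.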

kaovilai commented 1 year ago

Tried recreating the disconnected pod with its .spec.imagePullSecrets removed, and the pod reached Running status. Perhaps this is another thing for openshift-velero-plugin to implement.
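
As a rough illustration of the workaround described above, the sketch below removes `.spec.imagePullSecrets` from a pod manifest held as a plain dictionary. It is not the openshift-velero-plugin implementation; the function name, pod name, container name, and secret name are all hypothetical placeholders, and whether dropping the secrets is the right fix remains an open question in this issue.

```python
import copy


def without_image_pull_secrets(pod):
    """Return a copy of a pod manifest with .spec.imagePullSecrets removed.

    Hypothetical helper; the input manifest is left unmodified.
    """
    cleaned = copy.deepcopy(pod)
    cleaned.get("spec", {}).pop("imagePullSecrets", None)
    return cleaned


# Example manifest. The pod, container, and secret names are made up
# for illustration; only the image reference comes from the issue.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-pod", "namespace": "mongo-persistent"},
    "spec": {
        "imagePullSecrets": [{"name": "example-dockercfg"}],
        "containers": [
            {
                "name": "example-container",
                "image": (
                    "image-registry.openshift-image-registry.svc:5000/"
                    "mongo-persistent/todolist-mongo-go@"
                    "sha256:5bf85296fbc9ab386a0e1cc153002db38637e23936"
                    "a2593c261f72e236cbd0a7"
                ),
            }
        ],
    },
}

cleaned = without_image_pull_secrets(pod)
print("imagePullSecrets" in cleaned["spec"])
```

Working on a deep copy keeps the original manifest available for comparison, which is useful when checking whether the recreated pod differs from the failing one only in its pull secrets.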

kaovilai commented 1 year ago

OADP-1.1 issue https://issues.redhat.com/browse/OADP-1487