Catalog Source Pod is not recreated when transitioned to the terminated state

Bug Report

What did you do?

Have a catalog source.
catalog source generates a pod on node A.
node A get restarted/replaced.
Pod not replaced

What did you expect to see? Pod is recreated.

What did you see instead? Under which circumstances? Dead pod not replaced. But deleting the dead pods manually will trigger recreate.

Environment

operator-lifecycle-manager version:

OLM version: v0.20.0
git commit: e6428a19b52d2fd7e689577d7be55223b1b2e5f8

Kubernetes version information:

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"clean", BuildDate:"2022-03-16T15:58:47Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.6-gke.1500", GitCommit:"5595443086b60d8c5c62342fadc2d4fda9c793e8", GitTreeState:"clean", BuildDate:"2022-02-09T09:25:03Z", GoVersion:"go1.16.12b7", Compiler:"gc", Platform:"linux/amd64"}

Kubernetes cluster kind: GKE

Possible Solution

Check for this pod condition and replace it.

One of the comment in https://github.com/operator-framework/operator-lifecycle-manager/issues/2666 suggests to make CatalogSource Pod controller back (deployment/ss), that should also resolve this.

Additional context

Pod status

``` yaml status: conditions: - lastProbeTime: null lastTransitionTime: "2022-03-24T18:48:24Z" status: "True" type: Initialized - lastProbeTime: null lastTransitionTime: "2022-03-24T21:02:39Z" message: 'containers with unready status: [registry-server]' reason: ContainersNotReady status: "False" type: Ready - lastProbeTime: null lastTransitionTime: "2022-03-24T21:02:39Z" message: 'containers with unready status: [registry-server]' reason: ContainersNotReady status: "False" type: ContainersReady - lastProbeTime: null lastTransitionTime: "2022-03-24T18:48:24Z" status: "True" type: PodScheduled containerStatuses: - containerID: containerd://d327b53da13df65232bfaa19120022a6a96af630350ac43fb51a17b15f0c55bb image: quay.io/operatorhubio/catalog:latest imageID: quay.io/operatorhubio/catalog@sha256:009ba4d793616312c7a847dd4a64455971b2d7d68a5d2a16e76d6df3ce03eedc lastState: {} name: registry-server ready: false restartCount: 0 started: false state: terminated: containerID: containerd://d327b53da13df65232bfaa19120022a6a96af630350ac43fb51a17b15f0c55bb exitCode: 0 finishedAt: "2022-03-24T21:02:38Z" reason: Completed startedAt: "2022-03-24T18:48:28Z" hostIP: 10.100.4.19 message: Pod was terminated in response to imminent node shutdown. phase: Failed podIP: 10.100.18.45 podIPs: - ip: 10.100.18.45 qosClass: Burstable reason: Terminated startTime: "2022-03-24T18:48:24Z"``` ```

operator-framework / operator-lifecycle-manager

Catalog Source Pod is not recreated when transitioned to the terminated state #2709

Bug Report