zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License
4.25k stars 969 forks

[feature request] add possibility to ignore differences in pod image in sync process #2622

Open martin31821 opened 4 months ago

martin31821 commented 4 months ago

We are experiencing the same issue as #1397, #2453 and #1955, and I'd like to propose a fix for it. It is company policy for us to mirror every Docker image we use in our own private registry. We do this by running a k8s mutating webhook that pulls the Docker images, pushes them to our own registry, and then swaps out the image reference on pod creation.

Therefore our pods always have a different image set than the StatefulSet, so postgres-operator will kill and recreate all of our clusters every sync interval.

I'd like to propose a new configuration option ignoreImageDifference, which by default should be false to keep the current behavior. If it's set to true, differences in the actual image in the pod vs in the statefulset will be ignored by the sync processor.
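A minimal sketch of what the proposed option could look like in the image comparison, for illustration only. The type `syncOptions`, the function `podNeedsRecreate`, and the example image names are hypothetical and are not taken from the operator's code; only the option name `ignoreImageDifference` comes from the proposal above.

```go
package main

import "fmt"

// syncOptions is a hypothetical stand-in for the operator configuration.
// IgnoreImageDifference is the option proposed in this issue; it is an
// assumption here, not an existing operator setting.
type syncOptions struct {
	IgnoreImageDifference bool
}

// podNeedsRecreate reports whether a pod should be replaced because its
// container image differs from the image in the StatefulSet template.
// With IgnoreImageDifference set, an image mismatch never triggers a recreate.
func podNeedsRecreate(statefulSetImage, podImage string, opts syncOptions) bool {
	if opts.IgnoreImageDifference {
		return false
	}
	return statefulSetImage != podImage
}

func main() {
	// Made-up image references: the pod image was rewritten by a webhook.
	sts := "upstream.example/spilo:1.0"
	pod := "mirror.example/upstream.example/spilo:1.0"

	fmt.Println(podNeedsRecreate(sts, pod, syncOptions{}))                            // true
	fmt.Println(podNeedsRecreate(sts, pod, syncOptions{IgnoreImageDifference: true})) // false
}
```

With the default (`false`), the behavior matches what is described above: a mismatch leads to pod recreation. With the flag set, the mismatch is ignored.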

Jasper-Ben commented 4 months ago

FWIW we currently work around this by setting enable_lazy_spilo_upgrade=true, thus not triggering a Pod recreate, see: https://github.com/zalando/postgres-operator/blob/5357062857f3b724546a80849c2257ae7b804c14/pkg/cluster/sync.go#L408-L432

FxKu commented 4 months ago

Can you not specify the mirrored image in the global configuration? Not sure if I'm getting it. For delaying pod replacement on image differences you've already found the right option: enable_lazy_spilo_upgrade

Jasper-Ben commented 4 months ago

> Can you not specify the mirrored image in the global configuration? Not sure if I'm getting it. For delaying pod replacement on image differences you've already found the right option: enable_lazy_spilo_upgrade

👋 @FxKu, to explain our workflow:

We use https://estahn.github.io/k8s-image-swapper/v1.5/index.html which will automatically mirror new container images of upcoming pods into our own registry using a well-known naming scheme. The next time the same container image is used in an upcoming pod, image-swapper will notice that the image already exists in our registry and transparently replaces the image using a mutating webhook. Thus, we ensure images are always available independently of various upstream registries (in the past we had a case of downtime due to an unfortunate combination of unavailable upstream registry and newly spawned autoscaling instances that did not have the required image cached).
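To make the swap step concrete, here is a rough sketch of the decision described above. It is not k8s-image-swapper's actual code: the function `swapImage`, the `mirrored` lookup, and the naming scheme (mirror registry prefixed to the original reference) are assumptions chosen for illustration.

```go
package main

import "fmt"

// swapImage returns the image reference a new pod should use.
// If the image is already known to exist in the mirror, the reference is
// rewritten to point at the mirror; otherwise the original is kept.
// The "mirror + / + original" naming scheme is an assumption for this sketch.
func swapImage(image, mirrorRegistry string, mirrored map[string]bool) string {
	if mirrored[image] {
		return mirrorRegistry + "/" + image
	}
	return image
}

func main() {
	// Hypothetical state: one image has already been copied to the mirror.
	mirrored := map[string]bool{"upstream.example/spilo:1.0": true}

	fmt.Println(swapImage("upstream.example/spilo:1.0", "mirror.example", mirrored))
	fmt.Println(swapImage("upstream.example/other:2.0", "mirror.example", mirrored))
}
```

Because the rewrite happens on the pod and not on the StatefulSet, the two image strings differ afterwards, which is what the operator's sync then sees as a mismatch.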

Using the webhook rather than hardcoding the mirror image into the Helm values has the advantage that it solves the availability issue at scale throughout our entire cluster without any extra steps during workload setup.

As an added bonus, since we do IaC, we can use Renovate bot to easily manage dependencies without having to jump through any hoops due to changed image specifications.

Jasper-Ben commented 4 months ago

> For delaying pod replacement on image differences you've already found the right option: enable_lazy_spilo_upgrade

Yep, this works for us, but it still isn't ideal since we now have to roll out updates manually. Thinking about it, though, I'm actually not sure it is possible to have both using the operator 😅