zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License

PersistentVolumeClaim retention is feature-gated and should be handled conditionally #2599

Open bo0ts opened 5 months ago

bo0ts commented 5 months ago

After upgrading to v1.11.0 I noticed the following messages for every Postgres cluster on every startup of the operator:

time="2024-04-04T16:56:37Z" level=info msg="statefulset myapp/myapp-postgres is not in the desired state and needs to be updated" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-          terminationMessagePath: /dev/termination-log," cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-          terminationMessagePolicy: File," cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-      restartPolicy: Always," cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-      dnsPolicy: ClusterFirst," cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-      serviceAccount: postgres-pod," cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-      securityContext: {}," cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-      schedulerName: default-scheduler" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="+      securityContext: {}" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-      kind: PersistentVolumeClaim," cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-      apiVersion: v1," cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-      status: {" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-        phase: Pending" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-      }" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="+      status: {}" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="-  revisionHistoryLimit: 10" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="+  persistentVolumeClaimRetentionPolicy: {" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="+    whenDeleted: Retain," cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="+    whenScaled: Retain" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="+  }" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=info msg="reason: new statefulset's persistent volume claim retention policy do not match" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="replacing statefulset" cluster-name=myapp/myapp-postgres pkg=cluster
time="2024-04-04T16:56:37Z" level=debug msg="waiting for the statefulset to be deleted" cluster-name=myapp/myapp-postgres pkg=cluster

After that nothing happens. The StatefulSet (sts) is neither deleted nor fixed, and the messages show up on every start of the operator.

The cluster is running Kubernetes 1.25. It is my understanding that the PersistentVolumeClaim retention policy is behind a feature gate (StatefulSetAutoDeletePVC) that is only enabled by default from 1.27 onwards, and that those parts of the sts spec cannot be set while the gate is off. Indeed, trying to set them manually fails. See: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#persistentvolumeclaim-retention

The operator should not attempt to set those values if the feature is not available.
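To illustrate the request, here is a minimal sketch in Go of leaving the field unset when the cluster cannot honour it. The types retentionPolicy and statefulSetSpec and the helpers buildSpec and render are hypothetical stand-ins invented for this example, not the operator's or client-go's real types. The same flag would presumably also have to gate the comparison that produces the "do not match" reason in the log above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// retentionPolicy and statefulSetSpec are simplified, hypothetical stand-ins
// for the Kubernetes API types; only the fields needed here are modelled.
type retentionPolicy struct {
	WhenDeleted string `json:"whenDeleted"`
	WhenScaled  string `json:"whenScaled"`
}

type statefulSetSpec struct {
	Replicas int32 `json:"replicas"`
	// A nil pointer plus omitempty keeps the field out of the serialized spec.
	PersistentVolumeClaimRetentionPolicy *retentionPolicy `json:"persistentVolumeClaimRetentionPolicy,omitempty"`
}

// buildSpec only sets the retention policy when the caller says the
// cluster supports it (assumption: the caller has determined this somehow).
func buildSpec(replicas int32, retentionSupported bool) statefulSetSpec {
	spec := statefulSetSpec{Replicas: replicas}
	if retentionSupported {
		spec.PersistentVolumeClaimRetentionPolicy = &retentionPolicy{
			WhenDeleted: "Retain",
			WhenScaled:  "Retain",
		}
	}
	return spec
}

// render serializes a spec to JSON so the difference is visible.
func render(spec statefulSetSpec) string {
	out, err := json.Marshal(spec)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	fmt.Println(render(buildSpec(1, false))) // field omitted
	fmt.Println(render(buildSpec(1, true)))  // field present
}
```

With the flag off, the desired spec carries no persistentVolumeClaimRetentionPolicy at all, so it cannot differ from a live object that never had one.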

haeuserm commented 5 months ago

Observing the same. We are on GKE 1.26.

hau21um commented 5 months ago

Setting the configuration to persistent_volume_claim_retention_policy: {} or even removing it completely has no effect; the operator behaves the same way as described in this issue.

angelsantillana94 commented 3 months ago

I have the same problem.

aldelsa commented 3 months ago

Hi, same issue here :(

angelsantillana94 commented 2 months ago

Any update?

FxKu commented 1 month ago

Update your K8s, folks, c'mon! 1.27 is already old. Or somebody come up with a proper PR where the operator checks the server version.
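A rough sketch in Go of what such a server version check could look like. It assumes the operator has the major and minor version as strings, as client-go's discovery client returns them from ServerVersion(), and that 1.27 is the right cut-off because that is where the feature gate is on by default. The function name supportsPVCRetention and the constant are hypothetical, and the check ignores clusters that enable the gate explicitly on older versions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Assumption: PVC retention is enabled by default starting with this minor version.
const minMinorWithPVCRetention = 27

// supportsPVCRetention reports whether a server with the given major and minor
// version strings is new enough. Some distributions append "+" to the minor
// version (for example "26+"), so trailing plus signs are stripped first.
func supportsPVCRetention(major, minor string) (bool, error) {
	maj, err := strconv.Atoi(major)
	if err != nil {
		return false, fmt.Errorf("parse major version %q: %w", major, err)
	}
	mnr, err := strconv.Atoi(strings.TrimRight(minor, "+"))
	if err != nil {
		return false, fmt.Errorf("parse minor version %q: %w", minor, err)
	}
	return maj > 1 || (maj == 1 && mnr >= minMinorWithPVCRetention), nil
}

func main() {
	for _, v := range [][2]string{{"1", "25"}, {"1", "26+"}, {"1", "27"}} {
		ok, err := supportsPVCRetention(v[0], v[1])
		fmt.Printf("%s.%s supported=%v err=%v\n", v[0], v[1], ok, err)
	}
}
```

The result could be computed once at operator startup and used to decide whether the retention policy is set and compared at all.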