CrunchyData / postgres-operator

Production PostgreSQL for Kubernetes, from high availability Postgres clusters to full-scale database-as-a-service.
https://access.crunchydata.com/documentation/postgres-operator/v5/
Apache License 2.0

Running switchover before leader shutdown #4024

Open dooman87 opened 3 weeks ago

dooman87 commented 3 weeks ago

Overview

When the pod hosting the leader is terminating, run a Patroni switchover to minimize downtime.

Use Case

Operator version: 5.6

We currently deploy our PostgresCluster with two replicas, each running on its own node. Our EKS cluster is run by another team, and they need to update nodes, which causes pod eviction and re-creation. When the leader pod is terminated, Patroni performs a failover. Most of the time this causes 2-4 seconds of downtime, but sometimes the downtime can last up to 20 seconds.

Desired Behavior

We are currently developing a sidecar that runs a preStop hook, which executes a switchover if the pod is the leader.

Could that be the default behavior, with the preStop hook included in the postgres container?
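For illustration, here is a minimal sketch of the logic such a preStop hook could run. It is not what PGO ships today. It assumes the Patroni REST API is reachable from the hook at `http://localhost:8008` (a PGO deployment may require HTTPS and client certificates instead) and that the Patroni member name equals the pod's `HOSTNAME`. The names `PATRONI_URL`, `choose_switchover`, and `main` are hypothetical. `GET /cluster` and `POST /switchover` are standard Patroni REST endpoints.

```python
import json
import os
import urllib.request
from typing import Optional

# Assumption: Patroni REST API on localhost:8008 over plain HTTP.
PATRONI_URL = os.environ.get("PATRONI_URL", "http://localhost:8008")


def choose_switchover(cluster: dict, my_name: str) -> Optional[dict]:
    """Build the POST /switchover body, or return None if no switchover applies.

    Returns None when this member is not the leader or when there is no
    running replica to hand over to.
    """
    members = cluster.get("members", [])
    leader = next((m for m in members if m.get("role") in ("leader", "master")), None)
    if leader is None or leader.get("name") != my_name:
        return None

    replicas = [
        m for m in members
        if m.get("role") in ("replica", "sync_standby")
        and m.get("state") in ("running", "streaming")
    ]
    if not replicas:
        return None

    def lag(member: dict) -> float:
        # Patroni may report a non-numeric lag (e.g. "unknown"); sort those last.
        value = member.get("lag", 0)
        return value if isinstance(value, (int, float)) else float("inf")

    candidate = min(replicas, key=lag)
    return {"leader": my_name, "candidate": candidate["name"]}


def main() -> None:
    # Assumption: the Patroni member name matches the pod hostname.
    my_name = os.environ.get("HOSTNAME", "")
    with urllib.request.urlopen(f"{PATRONI_URL}/cluster", timeout=5) as resp:
        cluster = json.load(resp)

    body = choose_switchover(cluster, my_name)
    if body is None:
        return  # Not the leader, or nothing to switch to: let the pod stop normally.

    request = urllib.request.Request(
        f"{PATRONI_URL}/switchover",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        print(resp.status, resp.read().decode())
```

The hook's entrypoint would call `main()`. The pod's termination grace period would need to be long enough for the switchover to complete before the container receives SIGTERM.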

Additional Information

I'd be curious to know whether this has been considered before and whether the approach has any disadvantages.