reactive-tech / kubegres

Kubegres is a Kubernetes operator for deploying one or more clusters of PostgreSQL instances and managing database replication, failover, and backup.
https://www.kubegres.io
Apache License 2.0

Strategy to deal with cluster upgrade in GCP (rewind to use former primary as replica again) #63

Closed joeywang closed 3 years ago

joeywang commented 3 years ago

Dear Alex, thank you very much for your work on Kubegres. We are very happy to use this operator in our dev environment.

We recently migrated our PostgreSQL 13 to k8s with Kubegres. It was easy and straightforward.

Google pushes hard to keep nodes upgraded to the latest version, so you can expect upgrades every few months. Here is what happened:

  1. Primary db1 was terminated to migrate to a new node.
  2. Replica db2 was promoted to primary.
  3. Replica db3 was created as a standby for db2 (now primary).
  4. It was then time for db2 to migrate.
  5. db3 was promoted to primary.
  6. db4 was created as a standby for db3.

It is very good to have failover with no disruption to customers. The only issue is that the switchover happened twice.

So my thoughts on this:

  1. Use pg_rewind to reuse the old primary as a replica
  2. Switch off failover while nodes are upgrading

Please feel free to give any suggestions on this.

Thank you.

alex-arica commented 3 years ago

Thank you for your message.

For this type of situation, when a Kubernetes cluster is under maintenance, we use your second suggestion: switch off failover while the nodes are upgrading.

It is a manual process: you have to disable the failover feature for your Postgres cluster. Please see the field failover on this page, which explains how to do so: https://www.kubegres.io/doc/properties-explained.html
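As a rough illustration of that manual step, here is a minimal Python sketch that builds a `kubectl patch` command to toggle failover before and after a node upgrade. It assumes the field is `spec.failover.isDisabled` and that the custom resource is addressable as `kubegres`; please check both against the properties page linked above. The resource name `mypostgres` and the namespace `default` are placeholders, not values from this issue.

```python
import json
import shlex
from typing import List


def failover_patch(disabled: bool) -> str:
    """Return a JSON merge patch body that sets spec.failover.isDisabled.

    Assumption: the Kubegres resource exposes failover.isDisabled under spec.
    """
    return json.dumps({"spec": {"failover": {"isDisabled": disabled}}})


def kubectl_patch_command(name: str, namespace: str, disabled: bool) -> List[str]:
    """Build (but do not run) the kubectl command that applies the patch.

    Assumption: the custom resource can be referenced as "kubegres".
    """
    return [
        "kubectl", "patch", "kubegres", name,
        "--namespace", namespace,
        "--type", "merge",
        "--patch", failover_patch(disabled),
    ]


if __name__ == "__main__":
    # Before the node upgrade: disable automatic failover.
    print(shlex.join(kubectl_patch_command("mypostgres", "default", True)))
    # After the upgrade has finished: re-enable it.
    print(shlex.join(kubectl_patch_command("mypostgres", "default", False)))
```

The script only prints the commands so they can be reviewed before being run by hand or from a maintenance script. Remember to re-enable failover once the upgrade is complete, since the cluster will not promote a replica automatically while it is disabled.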

Please let me know if that helps.

alex-arica commented 3 years ago

I am closing this issue. If you have any questions, you can add a message here.