zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License
4.29k stars, 974 forks

Add enable_pod_antiaffinity to kind: postgresql #1147

Open alexey-gavrilov-flant opened 4 years ago

alexey-gavrilov-flant commented 4 years ago

enable_pod_antiaffinity can only be enabled as a global parameter, which leads to an immediate, massive rollover of all environments. That is very dangerous. Could you add this parameter to kind: postgresql? It does not seem like good practice to me to define such parameters in the operator; it would be better to move everything to kind: postgresql.

mboutet commented 4 years ago

I'd also add the pod_antiaffinity_topology_key to that.
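For reference, both options are currently set globally in the operator configuration. A minimal sketch, assuming the ConfigMap-based configuration; the ConfigMap name and the zone topology key are illustrative choices, not taken from this thread:

```yaml
# Global operator settings: these apply to every cluster the operator manages.
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-operator   # assumed name; use the one from your deployment
data:
  enable_pod_antiaffinity: "true"
  # Defaults to kubernetes.io/hostname; the zone key spreads pods across zones.
  pod_antiaffinity_topology_key: "topology.kubernetes.io/zone"
```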

FxKu commented 4 years ago

@alexey-gavrilov-flant @mboutet do you really need individual affinity settings for each cluster, or do you just fear the failover in all clusters? One idea could be to run a second operator instance with a different configuration and use the ownership annotation for the clusters where you want to change the behavior. Our motivation was to keep the Postgres manifest as simple as possible. I understand that individual settings are sometimes desired, but in this case I think a global setting is sufficient. I could be wrong, though.
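A rough sketch of the second-operator idea, assuming the operator's CONTROLLER_ID environment variable and the acid.zalan.do/controller annotation; the value second-operator and the cluster name are placeholders chosen for illustration:

```yaml
# Fragment of the second operator's Deployment (container spec only):
# this instance only manages clusters annotated with its controller ID.
env:
  - name: CONTROLLER_ID
    value: "second-operator"   # placeholder ID
---
# Postgres manifest handed over to the second operator instance.
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-demo-cluster      # placeholder name
  annotations:
    "acid.zalan.do/controller": "second-operator"
spec:
  teamId: "acid"
  numberOfInstances: 2
  volume:
    size: 1Gi
  postgresql:
    version: "13"
```

The second instance would then carry its own enable_pod_antiaffinity and pod_antiaffinity_topology_key values, leaving clusters owned by the first operator untouched.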

mboutet commented 4 years ago

In my case, it's more about having the possibility to set these parameters on a per-cluster basis. However, I think it would also be a good way to prevent the herd switchover of all the clusters.

My use case is that I use Terraform to provision my k8s cluster as well as all the core services such as Prometheus, postgres-operator, cert-manager, etc. Then, I deploy my app stack on this cluster using Helm, which includes a postgresql cluster. For a deployment in a production namespace, I want the postgresql cluster to have a topology key set to topology.kubernetes.io/zone so that the postgresql nodes are in different failure domains. However, if I deploy my stack in a non-prod namespace in the same cluster, I do not necessarily need the postgresql nodes to be in different failure domains (thus reducing overall cost). I hope this makes sense.

From a manifest standpoint, I personally don't think that giving the possibility to override these settings in the postgresql manifest would hurt the simplicity since these would be optional.
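To illustrate what such an optional override could look like, here is a purely hypothetical sketch. The fields enablePodAntiAffinity and podAntiAffinityTopologyKey do not exist in the postgresql manifest; they are made-up names for the feature requested here, and the cluster name is a placeholder:

```yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-prod-cluster      # placeholder name
  namespace: production
spec:
  teamId: "acid"
  numberOfInstances: 3
  volume:
    size: 10Gi
  postgresql:
    version: "13"
  # HYPOTHETICAL fields, not part of the current CRD. If omitted, the
  # operator-wide settings would apply as they do today.
  enablePodAntiAffinity: true
  podAntiAffinityTopologyKey: "topology.kubernetes.io/zone"
```

A non-prod manifest would simply leave both fields out and inherit the global defaults.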

alexey-gavrilov-flant commented 4 years ago

We use Git to link each psql installation to the application that uses it. But we need to tie the application to a specific group of nodes together with its psql installation, and then roll that out as a release. Since it is too expensive for us to coordinate a simultaneous rollout for all applications, it would be easier to apply the settings individually for each installation rather than for the entire cluster.

alexey-gavrilov-flant commented 4 years ago

Running two operators looks like a crutch: it shows that there are cases a single operator cannot handle with the necessary settings. That approach is handy for safely updating the operator or when updating an API.