CrunchyData / postgres-operator

Production PostgreSQL for Kubernetes, from high availability Postgres clusters to full-scale database-as-a-service.
https://access.crunchydata.com/documentation/postgres-operator/v5/
Apache License 2.0

Pod antiAffinity not supported #1062

Closed moshloop closed 4 years ago

moshloop commented 5 years ago

The current implementation creates a new deployment per replica, which has the side effect that anti-affinity rules cannot be used to ensure pods are not scheduled on the same host / zone.

Either the master and replicas should be implemented using a single StatefulSet, or antiAffinity across multiple deployments should be (re)implemented at the operator level.

jkatz commented 5 years ago

Hi @moshloop -- thanks for the feedback! We have plans to support pod anti-affinity rules in the next release along with some other changes to how we handle PostgreSQL clusters. Please stay tuned. Thanks!

moshloop commented 5 years ago

If you are planning to add support for this in the next release why close the issue? Surely it should remain open until the code has been committed and the feature is available?

tongpu commented 5 years ago

Another feature that I would deem important for stable production deployments is a PodDisruptionBudget, so that the operator's pods interact properly with the eviction API while a node is being drained. I brought this up in #875 but didn't get feedback back then. I think these two features would be important for ensuring hands-free operations.
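For illustration, a minimal sketch of such a PodDisruptionBudget is shown below. It is not something the operator creates. The name `develop-db-pdb` is hypothetical, and the `app: develop-db` label is an assumption borrowed from the example later in this thread, so the selector would need to match whatever labels the cluster's pods actually carry. The `policy/v1` API version is the current one; clusters from the time of this issue would have used `policy/v1beta1`.

```yaml
# Sketch only: name and label values are assumptions, not operator defaults.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: develop-db-pdb            # hypothetical name
spec:
  # Keep at least one matching pod running during voluntary evictions
  # such as `kubectl drain`.
  minAvailable: 1
  selector:
    matchLabels:
      app: develop-db             # assumed label shared by primary and replica pods
```

Because the selector works on labels rather than on a single controller, a budget like this can span the separate per-replica deployments described above.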

jkatz commented 5 years ago

Hi @tongpu,

Thanks for the feedback! This is something we will look into as we update our HA solution. To provide a bit more context, we are moving our PostgreSQL HA towards a distributed-consensus model that follows the Raft algorithm. We will see if setting the PodDisruptionBudget makes sense in that model. We are hoping the overall changes coming in our next release will help satisfy what you are looking for in "hands free" operation.

Thanks!

lkhomenk commented 5 years ago

One can add pod affinity/anti-affinity to cluster-deployment.json, and it will even be merged with nodeAffinity if a nodeSelector is given.

jkatz commented 5 years ago

@lkhomenk Thanks for the advice! By chance do you have an example spec file demonstrating this?

We are still planning to have podAntiAffinity (as well as continued support for nodeAffinity) supported natively in our 4.2 release just to ease the configuration burden.

lkhomenk commented 5 years ago

The part of cluster-deployment.json with affinity:

            "spec": {
              "affinity": {
                "podAntiAffinity": {
                  "requiredDuringSchedulingIgnoredDuringExecution": [
                    {
                      "topologyKey": "kubernetes.io/hostname",
                      "labelSelector": {
                        "matchExpressions": [
                          {
                            "key": "app",
                            "operator": "In",
                            "values": [
                              "{{.ClusterName}}"
                            ]
                          }
                        ]
                      }
                    }
                  ]
                }
              },  
                {{.SecurityContext }}

And this is what we get in the resulting pod spec, merged with the nodeSelector from affinity.json:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: role.testdomain.com
                operator: In
                values:
                  - db
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - develop-db
          topologyKey: kubernetes.io/hostname
  containers:

The main problem is the limited set of variables that can be used inside cluster-deployment.json, but we added our own labels. The main variable we use is ClusterName.
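The original request also mentioned spreading across zones. A hedged variant of the rendered spec above is sketched here, assuming the same `app: develop-db` label and that nodes carry the standard `topology.kubernetes.io/zone` label (older clusters used `failure-domain.beta.kubernetes.io/zone`). It uses the preferred (soft) form, so pods can still be scheduled when there are fewer zones than replicas.

```yaml
# Sketch only: label value and topology key are assumptions about the cluster.
spec:
  affinity:
    podAntiAffinity:
      # Soft rule: the scheduler tries to honor it but will not block scheduling.
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: app
                  operator: In
                  values:
                    - develop-db
            topologyKey: topology.kubernetes.io/zone
```

The required form shown earlier is stricter: with `kubernetes.io/hostname` as the topology key, a replica stays pending if no node without a matching pod is available.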

jkatz commented 4 years ago

Pod Anti-Affinity is now supported in the v4.2.0 release.