sabinematthys opened this issue 3 years ago
Additional comment. Yes, there are cordon and drain operations (Link):

```sh
kubectl cordon my-node   # Mark my-node as unschedulable
kubectl drain my-node    # Drain my-node in preparation for maintenance
```
I noticed the same: when a k8s node is cordoned, the pod gets moved to another k8s node in the DB cluster. This is incorrect behaviour: we may cordon healthy k8s nodes, and pods should not be moved to other k8s nodes in that situation. I would like to keep DB pods on the same k8s nodes as long as possible, for the stability of the DB cluster. "Drain" is the operation meant for moving pods away from a k8s node.
Indeed, cordon is not the only operation that can lead to a DB failover (any operation that puts the node in a not "Ready" status triggers it). But in the case of a drain, I think it would fail over even without any postgres-operator action, since the pod itself is evicted. It is only the cordon case that is annoying for us.
@sabinematthys, podAntiAffinity https://github.com/zalando/postgres-operator/blob/v1.6.1/docs/administrator.md#enable-pod-anti-affinity may help you spread your Pods.
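For reference, a minimal sketch of enabling it, assuming the operator is configured through its ConfigMap (the name and the topology key below are only illustrative; the CRD-based OperatorConfiguration has equivalent fields):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-operator        # adjust name/namespace to your deployment
data:
  enable_pod_antiaffinity: "true"
  # spread pods across nodes; kubernetes.io/hostname is the usual topology key
  pod_antiaffinity_topology_key: "kubernetes.io/hostname"
```

Note that anti-affinity only influences scheduling; it does not stop the operator from reacting to a cordoned (Unschedulable) node, which is the behaviour discussed below.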
But I kind of agree with you: we may sometimes cordon a "healthy" node for various legitimate reasons, and we don't want the node becoming Unschedulable https://github.com/zalando/postgres-operator/blob/v1.6.1/pkg/controller/node.go#L79 to drive the operator to move Pods away from it.
Cordon case (of a healthy k8s node, or even multiple k8s nodes):
Is it OK if the behaviour is changed or made configurable? Pods should not be moved in such a case.
https://github.com/zalando/postgres-operator/blob/v1.6.1/pkg/controller/node.go#L74 https://github.com/zalando/postgres-operator/blob/v1.6.1/pkg/controller/node.go#L79
Or is there a reason for the functionality? (Drain related?)
Who are you asking? I suppose not me, as it was indeed my initial request, but it works for me.
I just realized: if only one k8s node is cordoned, then only the leader role is transferred to another pod/DB node. Pods are not moved in that case.
A pod would only be moved if, e.g., the k8s nodes of all DB nodes are cordoned.
This still seems to be an issue. The only solution we found at the time (thanks @machine424) was to forbid the operator from get/list/watch on nodes; it runs without issues, apart from annoying error logs.
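For anyone looking for that workaround, it boils down to removing the `nodes` rule from the operator's ClusterRole (everything else stays as shipped). A rough sketch of the rule to delete, assuming the stock `postgres-operator` ClusterRole from the manifests/chart; the operator then logs errors about being unable to watch nodes but otherwise keeps working:

```yaml
# Rule to remove from the postgres-operator ClusterRole so the operator
# can no longer watch nodes and therefore never reacts to a cordon:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
```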
The solution listed at https://github.com/zalando/postgres-operator/issues/2277#issuecomment-1507039074 (thanks @FxKu) does not seem to be generic enough for all use cases (at least, for our use case, where we don't have specific readiness labels).
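For context on why that solution relies on labels: as far as I understand the check in https://github.com/zalando/postgres-operator/blob/v1.6.1/pkg/controller/node.go#L79, a cordoned node is still considered "ready" when it carries the configured `node_readiness_label`, so clusters that maintain such a label can cordon nodes without triggering a failover. A hypothetical sketch, assuming the ConfigMap-based configuration and an illustrative label key/value:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-operator
data:
  # a cordoned node that still carries this label keeps being treated as ready
  node_readiness_label: "lifecycle-status:ready"   # key/value purely illustrative
```

Every node then has to carry that label (e.g. via `kubectl label node`), which is exactly the kind of labelling scheme a cluster without readiness labels does not have.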
With Spilo 13, Patroni 2.0 and postgres-operator 1.6.0, I noticed that cordoning the master's node triggers a DB failover. Before the cordon we have:

Then I do:

```sh
kubectl cordon $(kubectl get pods -l application=spilo -l spilo-role=master -o=jsonpath='{.items[0].spec.nodeName}')
```
And after I see the following. Just after the cordon:
a bit after:
after stabilisation:
So we see that, with just a cordon, the leader is on ip1 while it was on ip2 before. This behaviour affects our testing strategy (which involves cordoning nodes to schedule pods on nodes other than the master, for instance) that was developed with postgres-operator 1.5.0, Spilo 12 and Patroni 1.6.5, and we are also not sure whether it could have an impact on upgrades of our application. Is there a way to deactivate this behaviour?
Postgres-operator pod log:
Master pod log (from another test, for easier log follow-up; this time db-cluster-1 was the master before the cordon):
It seems to be a new feature (between 1.5 and 1.6) which I don't really understand: unschedulable means new pods cannot be scheduled, so why does it mean the status of current ones has to be changed? In which case is that behaviour useful? (Just to understand why it was implemented.) In our case, we use cordon to prevent pods from being scheduled on the master's node (mainly for tests), but that is just one of several ways we use cordon on the master's node; how can we do that if the master switches nodes as soon as it is cordoned? So is there a way to disable that feature? (We didn't have this problem in the previous version.)