ibm-messaging / mq-helm


readinessProbe-based failover has disadvantages #42

Closed chughts closed 1 year ago

chughts commented 1 year ago

Question raised on MQ Community Forum

https://community.ibm.com/community/user/integration/discussion/ibm-mq-nativeha-on-kubernetes-with-ibm-messagingmq-helm-chart#bm3f629af2-b3d0-480b-890e-ef05767464f9

callumpjackson commented 1 year ago

Answer provided within the community, pasted here for reference: "As you say, IBM MQ uses the approach of having only one of the three Pods pass its readiness probe. There are some disadvantages to the current approach, as you've noticed, where some tools equate "readiness" with success. We did consider other alternatives, however they seemed to offer more disadvantages, for example:

1. Have all instances be "ready", and have any clients or queue managers that connect try each of the instances in turn. This would (in theory) be handled automatically by an MQ client application inside the same Kubernetes cluster, but could take three TCP/IP connections (and TLS handshakes) to find the active instance. The clients would obviously need all three addresses. Connections from outside the cluster would generally need to use a single address (e.g. of a Router or load balancer), and have application logic to retry up to three times. This should theoretically work in the MQ software product, but is not explicitly tested for Native HA.
2. Have a custom MQ-aware router running in another Pod, which could route traffic to the active instance. This would require a new component, and would certainly complicate Native HA deployments. The new component would need to be HA itself, so potentially two stateless Pods for the routing, plus three queue manager Pods.

Each option has downsides, but the current solution seemed the best fit. I don't think ArgoCD should be equating readiness with success, so would perhaps argue that they could improve here.
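The first alternative above relies on the MQ client walking a list of connection names until one accepts the connection. As a rough sketch of what that could look like from an application inside the cluster using the mq-golang bindings — the Pod addresses, channel name and queue manager name below are illustrative placeholders, not values taken from this chart:

```go
package main

import (
	"fmt"
	"log"

	"github.com/ibm-messaging/mq-golang/v5/ibmmq"
)

func main() {
	// Hypothetical headless-service addresses for a three-replica Native HA
	// StatefulSet; substitute the names used by your own deployment.
	connList := "qm-ibm-mq-0.qm-ibm-mq(1414)," +
		"qm-ibm-mq-1.qm-ibm-mq(1414)," +
		"qm-ibm-mq-2.qm-ibm-mq(1414)"

	cd := ibmmq.NewMQCD()
	cd.ChannelName = "SYSTEM.DEF.SVRCONN" // illustrative channel name
	cd.ConnectionName = connList          // client tries each address in turn

	cno := ibmmq.NewMQCNO()
	cno.ClientConn = cd
	cno.Options = ibmmq.MQCNO_CLIENT_BINDING

	// Connects to whichever instance is currently active; the standby
	// instances refuse the connection and the client moves on to the next.
	qMgr, err := ibmmq.Connx("QM1", cno)
	if err != nil {
		log.Fatalf("connection failed on all addresses: %v", err)
	}
	defer qMgr.Disc()
	fmt.Println("connected to the active Native HA instance")
}
```

In the worst case this walks all three addresses (and performs up to three TLS handshakes) before reaching the active instance, which is the latency cost described in the comment above.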

Rolling updates are discussed in Considerations for performing your own rolling update of a Native HA queue manager. In particular, any rolling update that has no regard for which Pod is the elected leader is not desirable for Native HA, as it has a 66% chance of resulting in an extended outage (two fail-overs): if the leader is the first or second Pod to be updated, the new leader will itself be updated very soon afterwards. You could even get very unlucky and require three fail-overs. For this reason, the MQ Operator implemented custom logic to perform the rolling update. This is a complex piece of code, and is closely tied both to how operators work and to the exact method of deployment (e.g. a StatefulSet of three replicas), so it is not a generic tool.
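For illustration only, the kind of leader-aware ordering described above could be sketched with client-go as follows. This is not the MQ Operator's actual logic; it assumes the StatefulSet uses the OnDelete update strategy (so a Pod only picks up a new revision when it is deleted), that the Ready condition identifies the active instance, and the namespace and label selector are hypothetical:

```go
package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// isReady reports whether the Pod's Ready condition is True; with the
// current chart only the active Native HA instance passes its readiness probe.
func isReady(pod corev1.Pod) bool {
	for _, c := range pod.Status.Conditions {
		if c.Type == corev1.PodReady {
			return c.Status == corev1.ConditionTrue
		}
	}
	return false
}

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	ctx := context.Background()
	ns, selector := "mq", "app.kubernetes.io/instance=qm" // illustrative values

	pods, err := client.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{LabelSelector: selector})
	if err != nil {
		log.Fatal(err)
	}

	// Delete the standby (not Ready) replicas first and the active instance
	// last, so the update causes at most one fail-over.
	var active []string
	for _, p := range pods.Items {
		if isReady(p) {
			active = append(active, p.Name)
			continue
		}
		log.Printf("deleting standby pod %s", p.Name)
		if err := client.CoreV1().Pods(ns).Delete(ctx, p.Name, metav1.DeleteOptions{}); err != nil {
			log.Fatal(err)
		}
		// A real tool would wait here for the replacement Pod to rejoin the quorum.
	}
	for _, name := range active {
		log.Printf("deleting active pod %s", name)
		if err := client.CoreV1().Pods(ns).Delete(ctx, name, metav1.DeleteOptions{}); err != nil {
			log.Fatal(err)
		}
	}
}
```

A real tool would also wait for each replacement Pod to rejoin the Native HA quorum before deleting the next one; the sketch only marks where that wait belongs, which is part of why the comment above describes the Operator's implementation as more than a generic tool.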

There aren't any plans to change either of these areas at the moment. If you feel strongly that one of the alternatives mentioned, despite its downsides, would work better in your environment, you could submit a feature request at https://ibm.biz/mqideas"