Coil egress has downtime due to the timing of updating coild and coil controller

cybozu-go / coil

CNI plugin for Kubernetes designed for scalability and extensibility

Apache License 2.0

Coil egress has downtime due to the timing of updating coild and coil controller #244

Closed terassyi closed 1 year ago

terassyi commented 1 year ago

When we update Coil, coil-egress suffers downtime because of the order in which coild and the coil controller are updated.

What

When we update the coil controller, the Deployments tied to Egress resources restart because their image is updated. At this point, the old Pods enter the Terminating state. However, if coild on the node has not yet come back up from its own restart, kubelet cannot delete the Pod and has to wait until coild starts.

As a result, even though the Pod's container process is already gone, Cilium still treats the Pod as able to handle existing traffic. This causes downtime on coil egress.

How

We have to control the order in which coild and the coil controller are updated: coild first, then the coil controller.

TODO

Survey whether we can use CKE's user-defined resources.
Fix the coil-controller so that it waits for coild to restart.
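
As a rough sketch of what "waiting for coild to restart" could look like, the check below decides whether a DaemonSet rollout has finished, using only the standard DaemonSet status fields. The function name and the dict-shaped input are assumptions made for illustration; this is not code from Coil or CKE.

```python
def daemonset_rolled_out(ds: dict) -> bool:
    """Return True when every scheduled Pod runs the new spec and is available.

    `ds` is assumed to be a DaemonSet object decoded from the Kubernetes API
    into a plain dict (hypothetical input shape for this sketch).
    """
    meta = ds.get("metadata", {})
    status = ds.get("status", {})

    # The DaemonSet controller must have observed the latest spec;
    # otherwise the counters below describe the previous revision.
    if status.get("observedGeneration", 0) < meta.get("generation", 0):
        return False

    desired = status.get("desiredNumberScheduled", 0)
    updated = status.get("updatedNumberScheduled", 0)
    available = status.get("numberAvailable", 0)

    return updated >= desired and available >= desired


# Example: coild is still restarting on one of three nodes.
coild = {
    "metadata": {"generation": 2},
    "status": {
        "observedGeneration": 2,
        "desiredNumberScheduled": 3,
        "updatedNumberScheduled": 3,
        "numberAvailable": 2,
    },
}
print(daemonset_rolled_out(coild))  # False
```

An updater following this idea would poll the check for the coild DaemonSet and only touch the coil-controller once it returns True.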

ymmt2005 commented 1 year ago

@zoetrope @terassyi How did you resolve it? I thought we could assign a common label to coild and coil-controller and have a PodDisruptionBudget for both Pods.

terassyi commented 1 year ago

@ymmt2005 We resolved this problem by fixing CKE.

With this fix, CKE applies user-defined resources in the order given by their rank, and we configured appropriate ranks for coild and coil-controller. Please see below for details.

ymmt2005 commented 1 year ago

@terassyi I believe this should be resolved in the Coil manifest, as Coil is a product independent of CKE.

ymmt2005 commented 1 year ago

I might have misunderstood the problem. It sounds like a bad interaction between the coil-egress Pod and coild on the same node, right?

If so, I have a question. Why does Cilium route packets to the coil-egress Pod even though it is in the Terminating state?

ysksuzuki commented 1 year ago

@ymmt2005

> I believe this should be resolved in the Coil manifest, as Coil is a product independent of CKE.

I don't think so. This problem is caused by the dependency of coil-egress on coild: a coil-egress Pod can't be removed completely without coild (the CNI plugin). In my opinion, this kind of dependency issue is generic and not specific to Coil.

ysksuzuki commented 1 year ago

> If so, I have a question. Why does Cilium route packets to the coil-egress Pod even though it is in the Terminating state?

Cilium redirects packets to the terminating backends for the existing service connections until the backends are finally removed. https://isovalent.com/blog/post/2021-12-release-111/#graceful-termination

For new connections, Cilium normally selects a new backend. In this case, however, it always selects the terminating backend, because the 5-tuple between a client and coil-egress is always the same: (UDP, dest, 5555, source, 5555). Cilium therefore never gets a chance to pick a new backend and keeps sending packets to the terminating one.
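
To make the 5-tuple argument concrete, here is a toy model of a connection table keyed by 5-tuple. It is only a sketch of the reasoning above, not Cilium's actual data path; the class name, backend names, and IP addresses are all made up for illustration.

```python
from typing import Dict, List, Set, Tuple

# (protocol, src_ip, src_port, dst_ip, dst_port)
FiveTuple = Tuple[str, str, int, str, int]


class ToyServiceLB:
    """Toy load balancer with a connection table keyed by 5-tuple (hypothetical)."""

    def __init__(self, backends: List[str]) -> None:
        self.backends = list(backends)
        self.terminating: Set[str] = set()
        self.conntrack: Dict[FiveTuple, str] = {}

    def mark_terminating(self, backend: str) -> None:
        self.terminating.add(backend)

    def route(self, flow: FiveTuple) -> str:
        # A known flow keeps its backend, even when that backend is terminating.
        if flow in self.conntrack:
            return self.conntrack[flow]
        # An unknown flow is assigned to a backend that is not terminating.
        candidates = [b for b in self.backends if b not in self.terminating]
        if not candidates:
            raise RuntimeError("no available backend")
        self.conntrack[flow] = candidates[0]
        return candidates[0]


lb = ToyServiceLB(["egress-old", "egress-new"])
fixed_flow = ("UDP", "10.0.0.1", 5555, "10.0.1.1", 5555)

print(lb.route(fixed_flow))          # egress-old
lb.mark_terminating("egress-old")
print(lb.route(fixed_flow))          # egress-old (same 5-tuple, same entry)

other_flow = ("UDP", "10.0.0.1", 40000, "10.0.1.1", 5555)
print(lb.route(other_flow))          # egress-new
```

Because both ports are fixed at 5555, a "new" connection from the same client is indistinguishable from the existing one in this model, so it keeps hitting the terminating backend; only a flow with a different tuple is moved to the new backend.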

ymmt2005 commented 1 year ago

@ysksuzuki Got it. If it's a generic problem that should be resolved elsewhere, I guess we should add some guidance to the documentation.