kubernetes / website

Kubernetes website and documentation repo:
https://kubernetes.io
Creative Commons Attribution 4.0 International

Elaborate the possibility of container termination to complete before Endpoints reconciliation #37223

Open Ziphone opened 1 year ago

Ziphone commented 1 year ago

This is a Feature Request. The section [1] should mention the possibility that all containers are terminated before the Pod has been removed from the associated Endpoints resources, as this can cause failed connections until the Endpoints controller has completed reconciliation.

1 - https://github.com/kubernetes/website/blob/main/content/en/docs/concepts/workloads/pods/pod-lifecycle.md?plain=1#L415

What would you like to be added: An elaboration on the possible flows, rather than just the example flow. Clarify that container termination can complete before Endpoints reconciliation.

Why is this needed: To make Kubernetes users aware of a possible pitfall that causes minor network disruptions.

sftim commented 1 year ago

How I'd tackle this:

/language en

Ziphone commented 1 year ago

Sounds like a good idea, Tim.

I also think that the content on graceful Pod deletion [1] should reflect these issues, which are caused by eventual consistency. During a regular rolling update there is also potential for opening connections to terminating Pods, because Kubernetes does not (as far as I understand) guarantee that Endpoints updates take place (and eventually propagate to the service proxy) before the kubelet starts sending SIGTERM to the containers of the terminating Pod. The kubelet and the Endpoints controller operate asynchronously, paying no attention to each other. If I'm wrong and an ordering is enforced, then I don't think that is reflected at the moment (also from [1]):

3. At the same time as the kubelet is starting graceful shutdown... 
... 
Pods that shut down slowly cannot continue to serve traffic as load balancers (like the service proxy) remove the Pod from the list of endpoints as soon as the termination grace period begins.

I've seen multiple articles [2] addressing this pitfall, and think it should be addressed or at least acknowledged by the official documentation.

1 - https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination
2 - https://engineering.rakuten.today/post/graceful-k8s-delpoyments/
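For illustration, the mitigation most of those articles suggest (not something this thread's docs change prescribes) is to combine a readiness probe with a short preStop sleep, giving the Endpoints controller and service proxies time to catch up before SIGTERM is delivered. A minimal, hypothetical Pod spec sketch (names, image, port, and timings are all placeholder assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                        # hypothetical name
spec:
  terminationGracePeriodSeconds: 30
  containers:
  - name: app
    image: example.com/app:1.0     # hypothetical image
    ports:
    - containerPort: 8080
    readinessProbe:
      httpGet:
        path: /readyz
        port: 8080
      periodSeconds: 2
    lifecycle:
      preStop:
        exec:
          # Delay SIGTERM so EndpointSlice reconciliation and
          # kube-proxy rule updates can propagate before shutdown.
          command: ["sleep", "5"]
```

The preStop delay does not remove the race, it only makes it much less likely to be lost; the kubelet and the Endpoints controller still operate asynchronously.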

sftim commented 1 year ago

/triage accepted /sig network /priority backlog

sftim commented 1 year ago

https://github.com/kubernetes/kubernetes/pull/110191#issuecomment-1142294392 says:

From what I can find within the code, K8S has always assumed that a pod without a readiness probe stays in ready=true until it has fully terminated. This PR fixes the case where a pod is defined with a readiness probe, is being terminated, and needs to run the readiness probe on termination.

We should document that Pods that back Services should have a readiness probe defined wherever this problematic behavior (sending traffic to terminating Pods) is not wanted. Also, the application code should start failing the readiness probe as soon as Pod termination is signalled.

We should mention this in https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination and also update https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#when-should-you-use-a-readiness-probe to link there for details.

Optionally, also update https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/ to link to https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination

sftim commented 1 year ago

Extra credit: add a diagram somewhere (might be easier once https://github.com/kubernetes/website/pull/36675 has merged)

sftim commented 1 year ago

The overall idea here is to probe Pods for readiness, even during shutdown, and use that readiness information to drop backends out of Services more gracefully.

Because unexpected failures can happen, workloads should also be prepared to handle the case where a Pod disappears from the API, or the container fails hard, even whilst traffic directed to that Pod is in flight.
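To make the "be prepared for abrupt failure" point concrete, a hedged client-side sketch for idempotent requests (function and parameter names are illustrative, not from this thread):

```python
import time

def call_with_retries(do_request, attempts=3, base_delay=0.1):
    """Retry an idempotent request: the backend Pod may vanish from the
    API or fail hard while traffic directed to it is still in flight."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return do_request()
        except ConnectionError as exc:
            last_exc = exc
            # Exponential backoff before retrying against a (hopefully)
            # reconciled set of endpoints.
            time.sleep(base_delay * (2 ** attempt))
    raise last_exc
```

Retries like this only help for idempotent operations; non-idempotent traffic needs deduplication or another safeguard on top.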

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

vaibhav2107 commented 1 year ago

/remove-lifecycle stale

k8s-triage-robot commented 4 months ago

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 weeks ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten