projectcontour / contour

Contour is a Kubernetes ingress controller using Envoy proxy.
https://projectcontour.io
Apache License 2.0

using emptyDir in the Envoy daemonset/deployment / draining concerns / connection failures #4322

Open sendaie opened 2 years ago

sendaie commented 2 years ago

Hello Contour community!

We run Contour in GKE behind an L4 Internal TCP Load Balancer (provisioned by GKE from our Envoy Service of Type: LoadBalancer with the annotation "cloud.google.com/load-balancer-type: Internal"). The issue we see is that the Envoy pods in our Contour setup are not evacuated/deleted during a node drain: either because they belong to a daemonset, or, when we switch to a deployment, because the pods use local storage:

There are pending nodes to be drained:
...
cannot delete Pods with local storage (use --delete-emptydir-data to override): ctest/envoy-77d84d687f-5jrj5

The local storage in this case is the emptyDir volume for the "/config" directory, which is shared between Envoy, the initContainer, and the shutdown-manager.
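For context, here is a minimal sketch of that volume layout, abridged from the shape of the upstream example manifests; container names, images, and commands are illustrative, not the exact upstream definitions:

```yaml
# Abridged pod spec sketch: the shared emptyDir that "kubectl drain"
# reports as "local storage". Names and images are assumptions.
spec:
  initContainers:
    - name: envoy-initconfig
      image: ghcr.io/projectcontour/contour:latest
      command: ["contour", "bootstrap", "/config/envoy.json"]
      volumeMounts:
        - name: envoy-config
          mountPath: /config
  containers:
    - name: envoy
      image: docker.io/envoyproxy/envoy:latest
      volumeMounts:
        - name: envoy-config
          mountPath: /config
    - name: shutdown-manager
      image: ghcr.io/projectcontour/contour:latest
      command: ["contour", "envoy", "shutdown-manager"]
      volumeMounts:
        - name: envoy-config
          mountPath: /config
  volumes:
    - name: envoy-config
      emptyDir: {}  # this is what a default drain refuses to delete
```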

What we experience is that whenever an automatic node removal takes place (cluster upgrade, node pool rollover, autoscaling scale-down), our L4 Internal Load Balancer keeps forwarding traffic to that pod/node until the LB's health check fails (up to 3 × 8 = 24 seconds, which is out of our control).

We assume that if the Pod deletion succeeded, it would signal to the control plane that the Pod is gone, which in turn would make the GKE L4 LB stop forwarding traffic to it.

We can reproduce the problem, and we wonder whether anyone else has experienced this, and whether there is a setup where Contour doesn't use local storage.

Thank you, sendai

youngnick commented 2 years ago

Thanks for this report @sendaie.

Ironically enough, we used to use Envoy's admin UI to do the shutdown over the network, which avoids this problem, but it led to our first CVE (https://github.com/projectcontour/contour/security/advisories/GHSA-mjp8-x484-pm3r), and the emptyDir solution was added to fix that.

I think we'll need to spend some time investigating how best to handle this tricky set of requirements.

rajatvig commented 2 years ago

What we have done to work around this is build a container using Envoy as the base image with the contour binary included, which is run both on startup (the initContainer case) and on shutdown.

This runs via a custom entrypoint and a preStop hook that terminates the pod after a few seconds of sleep, allowing zero-downtime upgrades.
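A hedged sketch of what that preStop setup might look like; the image name, sleep duration, and grace period are assumptions, not the exact configuration described above:

```yaml
# Illustrative pod spec fragment for the preStop-sleep approach.
containers:
  - name: envoy
    # Hypothetical custom image: Envoy base with the contour binary added.
    image: registry.example.com/envoy-with-contour:latest
    lifecycle:
      preStop:
        exec:
          # Sleep before termination so the load balancer stops routing
          # to this pod before Envoy actually shuts down.
          command: ["/bin/sh", "-c", "sleep 15"]
terminationGracePeriodSeconds: 30  # must exceed the preStop sleep
```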

For clusters where the Envoy pods autoscale, we have set externalTrafficPolicy to Cluster to avoid packet loss when nodes running Envoy are removed. This enhancement, when fully merged, would remove that requirement.
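The Service side of that workaround might look like the following sketch; the selector, ports, and Service name are illustrative assumptions:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: envoy
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  # Cluster lets the LB reach Envoy through any node, avoiding packet
  # loss when an Envoy node is removed, at the cost of an extra hop
  # and losing the client source IP (unlike Local).
  externalTrafficPolicy: Cluster
  selector:
    app: envoy
  ports:
    - name: http
      port: 80
      targetPort: 8080
```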

stevesloka commented 2 years ago

The daemonset is what's blocking you on the volume bits. Have a look at whether the deployment model would work; there's an example in the examples dir. It allows for a clean termination when draining a node.

rajatvig commented 2 years ago

@stevesloka We moved from a Daemonset to a Deployment to address the issues around clean termination but ran into draining issues due to the emptyDir.

Moving everything into one container makes it almost like other ingress controller setups: a single container with no emptyDir mounts.

stevesloka commented 2 years ago

I'm surprised you get the emptyDir notice with a deployment; I'm not aware of that happening.

What version of k8s are you on?

rajatvig commented 2 years ago

We are using 1.21.

The default drain options on a node do not allow deleting pods that mount emptyDir volumes.

There is some discussion in https://github.com/kubernetes/kubernetes/issues/80228
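For reference, this matches the override flag quoted in the original error message; a sketch of the two invocations (the node name is a placeholder, and this assumes a live cluster):

```
# A default drain refuses pods that mount emptyDir volumes:
kubectl drain <node-name> --ignore-daemonsets
#   cannot delete Pods with local storage (use --delete-emptydir-data to override)

# Overriding deletes the pods along with their emptyDir contents:
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
```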

github-actions[bot] commented 1 year ago

The Contour project currently lacks enough contributors to adequately respond to all Issues.


Please send feedback to the #contour channel in the Kubernetes Slack
