envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
25.14k stars 4.83k forks source link

Envoy POD Stuck with DNS time outs in Kuberenetes #37446

Open kiranmaiem79 opened 1 day ago

kiranmaiem79 commented 1 day ago

If you are reporting any crash or any potential security issue, do not open an issue in this repo. Please report the issue via emailing envoy-security@googlegroups.com where the issue will be triaged appropriately.

Title: Envoy pod stuck with dns time outs in kuberenetes when kubedns service restarts and point to a new IP

Description: We deployed Envoy proxy as a forward http proxy container in kubernetes with "STRICT DNS" resolution. We observed that after kubernetes kubedns service restartes with new IP, Envoy pod unable to detect new kubedns ip during dns resolution. It fails all requests with dns timeout error. Is there any configuration in Envoy to forcefully reset dns socket after a no of timeouts so that it can check the iptables and pick new up new DNS server IP for future DNS resolutions?

Now our only option to recover envou is to restart the container.

[optional Relevant Links:] Same issue as https://github.com/envoyproxy/envoy/issues/7965

ggreenway commented 1 day ago

cc @mattklein123 @yanavlasov