Closed dcarley closed 1 year ago
Hi @dcarley! Thanks for your detailed report and reproduction steps.
We are already working on a fix for this behavior, but it has been challenging. As you mentioned, you can work around this issue by changing dns_order. Another option is to avoid FQDN hostnames, but that can decrease name resolution performance.
We will update this issue as soon as we come up with a fix.
That's great, thanks!
We use FQDNs everywhere because we've been bitten by Kubernetes DNS performance and Kong timer exhaustion, both separately and combined, a few times in the past.
The timer exhaustion issue has been fixed in Kong 3.x, but you are right: if you can stick to using FQDNs, the performance is much better.
I observed the timer exhaustion issue in production with Kong 3.0. Related issue: https://github.com/Kong/kong/issues/9959
Dear contributor, We're closing this issue as there hasn't been any update to it for a long time. If the issue is still relevant in the latest version, please feel free to reopen it. We're more than happy to revisit it again. Your contribution is greatly appreciated! Please have a look at our pledge to the community for more information. Sincerely, Kong Gateway Team
Is there an existing issue for this?
Kong version ($ kong version)
2.8.3 and 3.1.0
Current Behavior
When a plugin is configured to point to a Kubernetes ClusterIP Service that has:
Then Kong:
I've tested this with two plugins, to see if the behaviour was coming from the HTTP client. The configs for these are linked further down.
zipkin
DNS lookups for SRV and search domains:
Connection to 4318 instead of 9411:
Plugin error because it's used the wrong Jaeger port:
datadog
DNS lookups for SRV and search domains:
Connection to 1234 instead of 8125:
Curiously it appears to retry against the other port:
Plugin error because it's used the wrong statsd port:
Expected Behavior
Kong should:
We have worked around this by setting dns_order to LAST,A,CNAME but the default behaviour was very surprising. Maybe SRV lookups should come last or be opt-in in a future major version?
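To make the effect of the workaround concrete, here is a deliberately simplified Python model of how the record-type order decides which port gets used. It is not Kong's resolver: the `resolve` function, the `RECORDS` table, and the `10.0.0.1` address are hypothetical, the `LAST` entry is skipped because the model has no cache, and SRV targets are treated as already resolved to an address. The ports are the zipkin ones from this report (9411 configured, 4318 returned via SRV), and `LAST,SRV,A,CNAME` is Kong's documented default for dns_order.

```python
from typing import Dict, List, Optional, Tuple

# (address, port) pairs; only SRV answers carry a port.
Answer = Tuple[str, Optional[int]]

DEFAULT_ORDER = ["LAST", "SRV", "A", "CNAME"]   # Kong's default dns_order
WORKAROUND_ORDER = ["LAST", "A", "CNAME"]       # the workaround from this issue


def resolve(configured_port: int,
            records: Dict[str, List[Answer]],
            order: List[str]) -> Tuple[str, int]:
    """Return (address, port) from the first record type in `order` that answers.

    Simplified model: an SRV answer overrides the configured port,
    any other record type keeps it.
    """
    for rtype in order:
        if rtype == "LAST":
            continue  # no "last successful type" cache in this sketch
        answers = records.get(rtype, [])
        if not answers:
            continue
        address, srv_port = answers[0]
        if rtype == "SRV" and srv_port is not None:
            return address, srv_port  # port taken from DNS, not from the plugin config
        return address, configured_port
    raise LookupError("no records found for any type in order")


# Hypothetical answers for a Service that exposes more than one port.
RECORDS: Dict[str, List[Answer]] = {
    "SRV": [("10.0.0.1", 4318)],
    "A": [("10.0.0.1", None)],
}

print(resolve(9411, RECORDS, DEFAULT_ORDER))     # ('10.0.0.1', 4318)
print(resolve(9411, RECORDS, WORKAROUND_ORDER))  # ('10.0.0.1', 9411)
```

With SRV removed from the order, the A record answers first and the configured port is kept, which matches what we see after applying the workaround. In a real deployment the setting would go in kong.conf as dns_order or in the KONG_DNS_ORDER environment variable.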
Steps To Reproduce
You'll need to use 3 concurrent terminal windows so I've broken the instructions into those groups.
The YAML manifests can be found here: https://gist.github.com/dcarley/888134d7f4eea62efeb49c4bf33a1406
The whole process can be repeated for datadog.yaml instead of zipkin.yaml.
Terminal 1
Start minikube:
Start minikube tunnel so that we can access the LoadBalancer Service (this might be Mac specific):
Terminal 2
Deploy the chart:
Export the address using the instructions from the Helm output:
Deploy an example app, backend, and plugin from:
Note the ClusterIP of the backend:
Terminal 3
Start a debug container to watch network traffic from the proxy:
Terminal 2
Make a request to trigger the plugin(s):
Anything else?
Minikube version:
Kubernetes version: