Plugin DNS lookups (Kubernetes, SRV records, incorrect port, search domains)

dcarley commented 1 year ago

Is there an existing issue for this?

[X] I have searched the existing issues

Kong version (`$ kong version`)

2.8.3 and 3.1.0

Current Behavior

When a plugin is configured to point to a Kubernetes ClusterIP Service that has:

more than one port
a hostname with a trailing dot, to prevent additional DNS lookups for search domains

Then Kong:

connects to a different port on the ClusterIP Service from the one configured in the plugin, which causes plugin errors and data loss
makes additional DNS lookups for search domains, which uses additional resources

I've tested this with two plugins, to see if the behaviour was coming from the HTTP client. The configs for these are linked further down.

zipkin

DNS lookups for SRV and search domains:

IP 172.17.0.3.33336 > 10.96.0.10.53: 13718+ SRV? jaeger.kube-system.svc.cluster.local. (54)
IP 10.96.0.10.53 > 172.17.0.3.33336: 13718*- 3/0/1 SRV jaeger.kube-system.svc.cluster.local.:4318 0 33, SRV jaeger.kube-system.svc.cluster.local.:9411 0 33, SRV jaeger.kube-system.svc.cluster.local.:16686 0 33 (382)
IP 172.17.0.3.33216 > 10.96.0.10.53: 57462+ A? jaeger.kube-system.svc.cluster.local.default.svc.cluster.local. (80)
IP 10.96.0.10.53 > 172.17.0.3.33216: 57462 NXDomain*- 0/1/0 (173)
IP 172.17.0.3.56435 > 10.96.0.10.53: 2101+ A? jaeger.kube-system.svc.cluster.local.svc.cluster.local. (72)
IP 10.96.0.10.53 > 172.17.0.3.56435: 2101 NXDomain*- 0/1/0 (165)
IP 172.17.0.3.34735 > 10.96.0.10.53: 55158+ A? jaeger.kube-system.svc.cluster.local.cluster.local. (68)
IP 10.96.0.10.53 > 172.17.0.3.34735: 55158 NXDomain*- 0/1/0 (161)

Connection to 4318 instead of 9411:

IP 172.17.0.3.34136 > 10.104.73.83.4318: Flags [S], seq 2021488018, win 64240, options [mss 1460,sackOK,TS val 1761064677 ecr 0,nop,wscale 7], length 0
IP 10.104.73.83.4318 > 172.17.0.3.34136: Flags [S.], seq 3655606717, ack 2021488019, win 65160, options [mss 1460,sackOK,TS val 1382267756 ecr 1761064677,nop,wscale 7], length 0

Plugin error because it's used the wrong Jaeger port:

[error] 1127#0: *7949 [kong] handler.lua:99 reporter flush failed: 404 Not Found, context: ngx.timer
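
The port mismatch is easier to see by listing the ports in the SRV answer. The sketch below pulls them out of the response line captured above; `srv_ports` is a hypothetical helper, not part of Kong or tcpdump. The configured port (9411) is only one of the three advertised, so a client that takes its port from the SRV record may end up on 4318.

```shell
# SRV answer copied from the tcpdump capture above.
answer='SRV jaeger.kube-system.svc.cluster.local.:4318 0 33, SRV jaeger.kube-system.svc.cluster.local.:9411 0 33, SRV jaeger.kube-system.svc.cluster.local.:16686 0 33'

# Hypothetical helper: print one advertised port per line.
srv_ports() {
  printf '%s\n' "$1" | grep -o 'SRV [^ ]*:[0-9]*' | sed 's/.*://'
}

# Prints 4318, 9411 and 16686, one per line.
srv_ports "$answer"
```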

datadog

DNS lookups for SRV and search domains:

IP 172.17.0.3.60069 > 10.96.0.10.53: 32873+ SRV? metrics.kube-system.svc.cluster.local. (55)
IP 10.96.0.10.53 > 172.17.0.3.60069: 32873*- 2/0/1 SRV metrics.kube-system.svc.cluster.local.:1234 0 50, SRV metrics.kube-system.svc.cluster.local.:8125 0 50 (296)
IP 172.17.0.3.37335 > 10.96.0.10.53: 61911+ A? metrics.kube-system.svc.cluster.local.default.svc.cluster.local. (81)
IP 10.96.0.10.53 > 172.17.0.3.37335: 61911 NXDomain*- 0/1/0 (174)
IP 172.17.0.3.36719 > 10.96.0.10.53: 48+ A? metrics.kube-system.svc.cluster.local.svc.cluster.local. (73)
IP 10.96.0.10.53 > 172.17.0.3.36719: 48 NXDomain*- 0/1/0 (166)
IP 172.17.0.3.54753 > 10.96.0.10.53: 57483+ A? metrics.kube-system.svc.cluster.local.cluster.local. (69)
IP 10.96.0.10.53 > 172.17.0.3.54753: 57483 NXDomain*- 0/1/0 (162)

Connection to 1234 instead of 8125:

IP 172.17.0.3.49517 > 10.99.80.78.1234: UDP, length 75
IP 10.99.80.78 > 172.17.0.3: ICMP 10.99.80.78 udp port 1234 unreachable, length 111
IP 172.17.0.3.49517 > 10.99.80.78.1234: UDP, length 77
IP 10.99.80.78 > 172.17.0.3: ICMP 10.99.80.78 udp port 1234 unreachable, length 113
IP 172.17.0.3.49517 > 10.99.80.78.1234: UDP, length 79
IP 10.99.80.78 > 172.17.0.3: ICMP 10.99.80.78 udp port 1234 unreachable, length 115

Curiously it appears to retry against the other port:

IP 172.17.0.3.42676 > 10.99.80.78.8125: UDP, length 75
IP 172.17.0.3.42676 > 10.99.80.78.8125: UDP, length 70
IP 172.17.0.3.42676 > 10.99.80.78.8125: UDP, length 77
IP 172.17.0.3.42676 > 10.99.80.78.8125: UDP, length 78
IP 172.17.0.3.42676 > 10.99.80.78.8125: UDP, length 79
IP 172.17.0.3.42676 > 10.99.80.78.8125: UDP, length 75

Plugin error because it's used the wrong statsd port:

[error] 1128#0: *70 send() failed (111: Connection refused), context: ngx.timer
[error] 1128#0: *70 [kong] statsd_logger.lua:82 failed to send data to metrics.kube-system.svc.cluster.local.:8125: connection refused, context: ngx.timer

Expected Behavior

Kong should:

use the port specified in the plugin configuration
not perform additional DNS lookups for FQDNs that have trailing dots
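
To make the second expectation concrete, here is a rough sketch of the conventional resolver rule: a name ending in a dot is absolute and is queried as-is, while other names are tried with each search domain appended. `candidates` is a hypothetical helper, and the `ndots` threshold from resolv.conf is deliberately ignored to keep it short. The search domains are the ones visible in the captures above.

```shell
# Hypothetical helper: list the names a resolver would be expected to query.
# $1 is the hostname; the remaining arguments are the search domains.
candidates() {
  name=$1
  shift
  case $name in
    *.)
      # Absolute name (trailing dot): query it unchanged, skip the search list.
      printf '%s\n' "$name"
      ;;
    *)
      # Relative name: try each search domain, then the bare name.
      for domain in "$@"; do
        printf '%s.%s.\n' "$name" "$domain"
      done
      printf '%s.\n' "$name"
      ;;
  esac
}

# Prints only jaeger.kube-system.svc.cluster.local.
candidates jaeger.kube-system.svc.cluster.local. \
  default.svc.cluster.local svc.cluster.local cluster.local
```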

We have worked around this by setting dns_order to LAST,A,CNAME but the default behaviour was very surprising.
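
For anyone applying the same workaround, a minimal sketch follows. It assumes Kong is configured through environment variables, where the `dns_order` property is read from `KONG_DNS_ORDER`. On the Helm chart the equivalent would presumably be an `env.dns_order` value, but that is an assumption about the chart and was not part of the testing described here.

```shell
# Workaround sketch: leave SRV out of the resolution order so that SRV
# answers are not consulted. The value is the one quoted above.
export KONG_DNS_ORDER="LAST,A,CNAME"
```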

Maybe SRV lookups should come last or be opt-in in a future major version?

Steps To Reproduce

You'll need to use 3 concurrent terminal windows so I've broken the instructions into those groups.

The YAML manifests can be found here: https://gist.github.com/dcarley/888134d7f4eea62efeb49c4bf33a1406

The whole process can be repeated for datadog.yaml instead of zipkin.yaml.

Terminal 1

Start minikube:

minikube start

Start minikube tunnel so that we can access the LoadBalancer Service (this might be Mac specific):

sudo --validate && minikube tunnel

Terminal 2

Deploy the chart:

helm upgrade test kong/kong --install --version 2.14.0 --wait

Export the address using the instructions from the Helm output:

HOST=$(kubectl get svc --namespace default test-kong-proxy -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
PORT=$(kubectl get svc --namespace default test-kong-proxy -o jsonpath='{.spec.ports[0].port}')
export PROXY_IP=${HOST}:${PORT}

Deploy an example app, backend, and plugin from the manifests linked above:

kubectl apply -f httpbin.yaml
kubectl apply -f zipkin.yaml

Note the ClusterIP of the backend:

kubectl get service -n kube-system
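
If only the address is needed, something like the following should print it directly. `backend_ip` is a hypothetical helper, and the Service names (`jaeger` for zipkin.yaml, `metrics` for datadog.yaml) are inferred from the DNS names in the captures above rather than read from the manifests.

```shell
# Hypothetical helper: print the ClusterIP of a Service in kube-system.
# Assumes kubectl is on PATH and pointed at the minikube cluster.
backend_ip() {
  kubectl get service -n kube-system "$1" -o jsonpath='{.spec.clusterIP}'
}
```

For example, `backend_ip jaeger` in Terminal 2 gives the value to substitute for `<ClusterIP>` in the tcpdump filter.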

Terminal 3

Start a debug container to watch network traffic from the proxy:

kubectl debug -it test-kong-6879b698d4-f4dhh --target proxy --image alpine
apk add tcpdump
tcpdump -nt port 53 or host <ClusterIP>

Terminal 2

Make a request to trigger the plugin(s):

curl $PROXY_IP/status/201

Anything else?

Minikube version:

% minikube version
minikube version: v1.25.2
commit: v1.25.2

Kubernetes version:

% kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"archive", BuildDate:"1980-01-01T00:00:00Z", GoVersion:"go1.17.13", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:19:12Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/arm64"}

locao commented 1 year ago

Hi @dcarley! Thanks for your detailed report and reproduction steps.

We are already working to fix this behavior, but it's been challenging. As you mentioned, you can work around this issue by changing dns_order. Another option is not using FQDN hostnames, but that can decrease name resolution performance.

We will update this issue as soon as we come up with a fix.

dcarley commented 1 year ago

That's great, thanks!

We use FQDNs everywhere because we've been bitten by Kubernetes DNS performance and Kong timer exhaustion, both separately and combined, a few times in the past.

locao commented 1 year ago

The timer exhaustion issue has been fixed in Kong 3.x, but you are right, if you can stick to using FQDNs the performance is way better.

surenraju-careem commented 1 year ago

I observed the timer exhaustion issue in production with Kong 3.0. Related issue: https://github.com/Kong/kong/issues/9959

StarlightIbuki commented 1 year ago

Dear contributor,

We're closing this issue as there hasn't been any update to it for a long time. If the issue is still relevant in the latest version, please feel free to reopen it. We're more than happy to revisit it again. Your contribution is greatly appreciated! Please have a look at our pledge to the community for more information.

Sincerely,
Kong Gateway Team
