Kong / kubernetes-ingress-controller

:gorilla: Kong for Kubernetes: The official Ingress Controller for Kubernetes.
https://docs.konghq.com/kubernetes-ingress-controller/
Apache License 2.0

Potential footgun: Very slow start up with many secrets in cluster #1447

Closed sandermvanvliet closed 3 years ago

sandermvanvliet commented 3 years ago



Summary

In a cluster with a large number of secrets, the Kong ingress controller is very slow to start, which leads to many forced restarts as its pods are killed because their readiness and liveness probes fail.

Kong Ingress controller version

1.3.1

Kong or Kong Enterprise version

2.4.1

Kubernetes version

Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:35:51Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.16", GitCommit:"46bb3f4471faba8c6a59ea27810ed4c425e44aec", GitTreeState:"clean", BuildDate:"2021-03-10T23:35:58Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Environment

What happened

We've been tracking down an issue where the ingress controller pod would take many attempts to start up because it was repeatedly killed after its readiness and liveness probes failed.

The reason for the failure was that the /healthz endpoint wasn't started yet because the ingress controller was busy collecting the necessary objects from the Kubernetes API.

We started looking in this direction because the metrics we collect through Prometheus showed the Read SLI on secrets spiking whenever the Kong ingress controller attempted to start:

[plot: Read SLI on secret reads spiking during Kong ingress controller start-up attempts]

In the plot you can see that these reads in some cases take minutes, which is quite bad to say the least.

Based on this metric we checked which secrets the ingress controller was requesting, and found that it was retrieving all secrets from all namespaces (we don't have a watchNamespace configured, as our ingresses and plugins live in various namespaces).

We deploy our applications using Helm 3, which stores release information as secrets (and keeps the last 10 revisions per app by default). This amounts to a large number of secrets in the cluster: a quick count gave us ~5000 secrets, of which we estimate ~90% are Helm secrets.

To confirm our suspicions, we compiled a version of the ingress controller with the client-go package vendored in and applied a label filter to the Secrets informer (see the sketch below). The label filter was defined as owner!=helm.
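For reference, a minimal sketch of the kind of filter we applied. The factory wiring below is illustrative rather than the controller's actual code; it assumes a standard client-go shared informer factory and a local kubeconfig:

```go
package main

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a clientset from the local kubeconfig (illustrative; the
	// controller itself normally uses in-cluster config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)

	// Create a shared informer factory whose List/Watch calls exclude
	// Helm release secrets via a label selector.
	factory := informers.NewSharedInformerFactoryWithOptions(
		clientset,
		10*time.Minute, // resync period, illustrative
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.LabelSelector = "owner!=helm"
		}),
	)

	// The secrets informer now only lists and watches non-Helm secrets.
	secretInformer := factory.Core().V1().Secrets().Informer()
	_ = secretInformer
}
```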

After building the Docker image and deploying it into our cluster, the start-up time dropped to < 10s; previously, even a 240-second grace period on the readiness/liveness probes wasn't enough.

With this change we now have a stable Kong ingress again; all instances are working fine.

I'm not sure whether this is a real bug, but it's something that can cause a lot of confusion (it sure did for us!). As far as I can tell it isn't covered in the documentation, but it's also very possible I didn't look in the right places.

With this issue we mainly want to make people aware of this behaviour; if needed, we can provide additional info.

ccfishk commented 3 years ago

To determine whether this is a real bug or a configuration issue, kindly provide more context.

sandermvanvliet commented 3 years ago

@ccfishk you can find an example log of a slow start-up here: https://gist.github.com/sandermvanvliet/1cc87865eb34c6a6b8c2cf55b7a51213. This was run with --log-level=debug and -v5.

Environment of ingress-controller:

CONTROLLER_ELECTION_ID: kong-ingress-controller-leader-kong-jedlixcom
CONTROLLER_INGRESS_CLASS: kong-jedlixcom
CONTROLLER_KONG_ADMIN_TLS_SKIP_VERIFY: true
CONTROLLER_KONG_ADMIN_URL: https://localhost:8444
CONTROLLER_KONG_LOG_LEVEL: debug
CONTROLLER_PUBLISH_SERVICE: prod-ingress-controllers/jedlixcom-kong-kong-proxy
KONG_LOG_LEVEL: debug
POD_NAME: fieldRef(v1:metadata.name)
POD_NAMESPACE: fieldRef(v1:metadata.namespace)

Environment of proxy:

KONG_ADMIN_ACCESS_LOG: /dev/stdout
KONG_ADMIN_ERROR_LOG: /dev/stderr
KONG_ADMIN_GUI_ACCESS_LOG: /dev/stdout
KONG_ADMIN_GUI_ERROR_LOG: /dev/stderr
KONG_ADMIN_LISTEN: 127.0.0.1:8444 http2 ssl
KONG_CLUSTER_LISTEN: off
KONG_DATABASE: off
KONG_KIC: on
KONG_LOG_LEVEL: debug
KONG_LUA_PACKAGE_PATH: /opt/?.lua;/opt/?/init.lua;;
KONG_NGINX_DAEMON: off
KONG_NGINX_WORKER_PROCESSES: 2
KONG_PLUGINS: bundled
KONG_PORTAL_API_ACCESS_LOG: /dev/stdout
KONG_PORTAL_API_ERROR_LOG: /dev/stderr
KONG_PORT_MAPS: 443:8443
KONG_PREFIX: /kong_prefix/
KONG_PROXY_ACCESS_LOG: /dev/stdout
KONG_PROXY_ERROR_LOG: /dev/stderr
KONG_PROXY_LISTEN: 0.0.0.0:8443 http2 ssl
KONG_SSL_CERT: /etc/secrets/tls-domaincom/tls.crt
KONG_SSL_CERT_KEY: /etc/secrets/tls-domaincom/tls.key
KONG_STATUS_LISTEN: 0.0.0.0:8100
KONG_STREAM_LISTEN: off

The output of kubectl get all --all-namespaces I'm not going to share here, as that's a bit too much info. Could you tell me what specifically you're looking for there?

ccfishk commented 3 years ago

To answer your question: what we need to determine in this case is why KIC 1.3.0 walks through all secrets before starting up. I just ran the same image in my environment and the output is different; very likely my setup is simpler.

sandermvanvliet commented 3 years ago

Secrets are by far the largest group of resources in our cluster, compared to, say, pods or endpoints.

My guess is that with ListAndWatch being called somewhere deep down, it tries to list and subscribe to changes on all of those secrets, which apparently doesn't scale very well.

I haven't had much luck reproducing this, but I'll set up a test cluster with a single pod running an API that has a Kong ingress, then add a ton of secrets and see if it shows the same behaviour.

At the very least that'll tell us whether our suspicion that this has to do with secrets is actually true; a quick way to check is sketched below.
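A minimal sketch of such a check, assuming a reachable kubeconfig: time an unfiltered List of secrets across all namespaces (roughly what the informer's initial ListAndWatch must do) against the same List with the owner!=helm selector applied. The selector and namespace choice mirror the filter described earlier and are assumptions, not the controller's behaviour:

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)
	ctx := context.Background()

	// Time an unfiltered List of all secrets in all namespaces.
	start := time.Now()
	all, err := clientset.CoreV1().Secrets(metav1.NamespaceAll).List(ctx, metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("all secrets: %d in %s\n", len(all.Items), time.Since(start))

	// Time the same List with Helm release secrets filtered out by label.
	start = time.Now()
	filtered, err := clientset.CoreV1().Secrets(metav1.NamespaceAll).List(ctx, metav1.ListOptions{
		LabelSelector: "owner!=helm",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("non-Helm secrets: %d in %s\n", len(filtered.Items), time.Since(start))
}
```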

sandermvanvliet commented 3 years ago

Still, adding that label filter makes it start up quickly, so I guess it's related.

Unless you mean that it shouldn’t even iterate over all those secrets in the first place.

ccfishk commented 3 years ago

@sandermvanvliet did you get a chance to add a ton of secrets and see if it shows the same behavior? Sorry for only getting back to you today; I am also trying to figure out what the root cause of this scenario could be.

ccfishk commented 3 years ago

I dumped 5k secrets onto a cluster using Go (corev1 secret creation) and it did not trigger the same behavior; the KIC pod reaches Running status within a second.
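For anyone who wants to try the same reproduction, a minimal sketch of that kind of bulk secret creation. The namespace, secret names, labels, type, and count below are illustrative assumptions (real Helm release secrets use their own naming and type), chosen so that a label filter such as owner!=helm would exclude them:

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)
	ctx := context.Background()

	// Create ~5000 dummy secrets labelled like Helm release secrets.
	for i := 0; i < 5000; i++ {
		secret := &corev1.Secret{
			ObjectMeta: metav1.ObjectMeta{
				Name:      fmt.Sprintf("dummy-helm-secret-%d", i),
				Namespace: "default",
				Labels:    map[string]string{"owner": "helm"},
			},
			Type: corev1.SecretTypeOpaque,
			Data: map[string][]byte{"release": []byte("placeholder")},
		}
		if _, err := clientset.CoreV1().Secrets("default").Create(ctx, secret, metav1.CreateOptions{}); err != nil {
			panic(err)
		}
	}
}
```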

Unless this is reproducible in a specific configuration, we will not treat this as a bug.