Izzette opened this issue 1 month ago
This issue is currently awaiting triage.
If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
/remove-kind bug
The bug label can be re-applied after a developer accepts, during triage, that this is a bug.
There are multiple pieces of information to look at:
Even the expensive commercial version of the Nginx webserver is not free from the problem of impacting traffic when a reload of nginx.conf occurs. In the case of an above-average number of ingresses and rules, we already have issues and information on the traffic impact, and there is nothing we can do about it today. That is conclusive. If, on top of that, the number of changes in the nginx.conf being reloaded is also large, then the disruption is even more unavoidable until the config reconciles.
There are other users of the controller who do find the optimum config for reliability, but none of them reported the ingress-nginx controller Service type as ClusterIP. You are showing a Service of type ClusterIP and appear to be generating changes and load from inside the cluster, over the ClusterIP. We acknowledge that this will break traffic, and regardless of what you think the bugs/problems and their solutions are, CPU/memory/I/O and their speeds are directly correlated with the various race conditions anyone can cook up. What the project is working on is splitting the control plane from the data plane so that both performance and security are improved. Look at the current open PRs related to this.
Also, there have been attempts by other users who reported a similar expectation to play with various timeouts. But since that is an extremely specific config for each environment and each use case of a given environment, I think that is one area to explore.
... ingress-nginx controller service --type as ClusterIP.
Actually, this issue impacts us in our specific case where we're not using the Service cluster IP; rather, the external load balancer hits the pod IP directly through a GCP LB NEG. It calls the same health check as Kubernetes, and the ingress-nginx-controller once again erroneously reports that it is ready when it has not yet loaded the ingress config. The problem is that the "initial sync" of the Kubernetes state for Ingresses isn't actually complete at the point where the controller reports that the initial sync is done and the dynamic load balancer is initialized.
The k8s Service implementation for the ingress or backends is not relevant in this case. 404 is being returned by nginx because the appropriate ingress isn't populated in the Lua shared dictionary, nor are the servers/locations templated in the nginx configuration.
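One way to look at both halves of that state on a running controller pod is sketched below. This is a hedged illustration only: it assumes the image's bundled /dbg helper is available, that nginx.conf is at its default path, and `<controller-pod>` is a placeholder for the actual pod name.

```sh
# Backends currently loaded into the Lua shared dictionary (the dynamic configuration).
kubectl --namespace ingress-nginx exec <controller-pod> -- /dbg backends list

# Server blocks actually templated into the nginx configuration so far.
kubectl --namespace ingress-nginx exec <controller-pod> -- grep 'server_name ' /etc/nginx/nginx.conf
```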
when a reload of nginx.conf occurs
Actually, we don't face any problems when a reload of nginx occurs. Rather, the multiple reloads on startup of ingress-nginx pods are merely a symptom of the specific implementation in ingress-nginx-controller and its interaction with the Kubernetes API (the Golang side) that results in the bug.
@Izzette thank you for the comments. Helps.
This is a vast topic, so discussion is only possible with specifics, which is why I am looking at your reproduce steps. We are root-causing the 404; the key data is the output of k describe ing for hello-700, or k get ep, at the time of the 404. Very likely this showed no endpoints.
So my comments above are directed at the tests, but even with any other tests, if there is starvation of resources like CPU/memory etc. that I listed earlier, there will be events leading to the EndpointSlice becoming empty. That is expected.
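For concreteness, a minimal way to capture that state at the moment of the 404 (assuming the hello-700 ingress and the hello Service names used in the reproduction steps below):

```sh
# Run these in another shell at the moment the curl loop breaks with a 404.
kubectl --namespace default describe ingress hello-700
# The hello-N ingresses in the reproduction all point at the "hello" Service.
kubectl --namespace default get endpoints hello
```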
Even though the results will be the same, I would do these tests by first installing MetalLB in the kind cluster, specifying the kind docker container's IP address as the start and end of the address pool. Then I would add an /etc/hosts entry for hello-700 on the host running kind and send a curl request from the host shell to hello-700.example.com. That simulates your use case more closely (not that the results will be any different).
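A rough sketch of that setup, assuming a kind cluster named "kind" and MetalLB v0.14.x; the MetalLB version and the pool address are assumptions, so adjust both to your environment:

```sh
# Install MetalLB and wait for its pods.
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.8/config/manifests/metallb-native.yaml
kubectl wait --namespace metallb-system --for=condition=ready pod \
  --selector=app=metallb --timeout=120s

# Inspect the docker network kind uses, to pick an address for the pool.
docker network inspect kind -f '{{range .IPAM.Config}}{{.Subnet}} {{end}}'

# Single-address pool (start equals end), e.g. 172.18.0.240 picked from the subnet above.
cat <<'EOF' | kubectl apply -f -
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: kind-pool
  namespace: metallb-system
spec:
  addresses:
  - 172.18.0.240-172.18.0.240
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: kind-l2
  namespace: metallb-system
EOF

# /etc/hosts entry on the host running kind, then curl from the host shell.
echo '172.18.0.240 hello-700.example.com' | sudo tee -a /etc/hosts
curl --verbose http://hello-700.example.com/
```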
Lastly, to repeat: if you starve the cluster of CPU/memory/bandwidth/conntrack/inodes and I/O, generate load on the api-server, and top it off with a rollout, the /healthz endpoint of the controller may still respond OK and thus move the pod to the Ready state. I am not surprised.
And the only choice we have at this time is to play with the timeouts, specifically increasing initialDelaySeconds and all the other configurables related to probe behaviour.
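For illustration, tuning those knobs through the Helm chart might look like the sketch below. The value keys are the same ones used in the install command later in this thread; the specific numbers are placeholders, not recommendations.

```sh
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx \
  --set controller.readinessProbe.initialDelaySeconds=30 \
  --set controller.readinessProbe.periodSeconds=5 \
  --set controller.readinessProbe.failureThreshold=6 \
  --set controller.livenessProbe.initialDelaySeconds=30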
And just for sanity's sake, the tests will be the same if you use kubectl create deploy httpd --image httpd:alpine --port 80 && kubectl expose deploy httpd && kubectl create ing httpd --class nginx --rule httpd.example.com/"*"=httpd:80, just so that the ingress is for HTTP/HTTPS and therefore, once again, a closer simulation (not that it will matter much). Thanks.
Ah, forgot to mention: I am also interested in using 5 replicas and setting min-available to 3, then doing the load and rollout as per your design.
I am able to reproduce with httpd.example.com using the backend image docker.io/library/httpd:alpine just fine. I do, of course, need other ingresses so that they are loaded before httpd.example.com, generating the partial config.
(kind-kind/default) 0 ✓ izzi@Isabelles-MacBook-Pro.local ~ $ kubectl run --namespace default --context kind-kind --tty --stdin --restart=Never --command --image nicolaka/netshoot:latest test -- bash
test:~# get_url() {
curl \
--show-error \
--verbose \
--silent \
--output /tmp/curl-body.dat \
--write-out '%{http_code}\n' \
--header 'Host: httpd.example.com' \
http://ingress-nginx-controller.ingress-nginx.svc.cluster.local. \
2> /tmp/curl-error.log
}
# Curl httpd.example.com ingress until the http status is 404
while [ "$(get_url)" != 404 ]; do
# Nothing at all, as quickly as possible
:
done
# Print the last request error log and body.
cat /tmp/curl-error.log /tmp/curl-body.dat
* Host ingress-nginx-controller.ingress-nginx.svc.cluster.local.:80 was resolved.
* IPv6: (none)
* IPv4: 10.96.199.65
* Trying 10.96.199.65:80...
* Connected to ingress-nginx-controller.ingress-nginx.svc.cluster.local. (10.96.199.65) port 80
> GET / HTTP/1.1
> Host: httpd.example.com
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 404 Not Found
< Date: Thu, 17 Oct 2024 07:19:55 GMT
< Content-Type: text/html
< Content-Length: 146
< Connection: keep-alive
<
{ [146 bytes data]
* Connection #0 to host ingress-nginx-controller.ingress-nginx.svc.cluster.local. left intact
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>
test:~#
while I run in a different shell:
(kind-kind/default) 0 ✓ izzi@Isabelles-MBP.barityo.lan ~ $ kubectl --namespace ingress-nginx --context kind-kind rollout restart deployment ingress-nginx-controller
deployment.apps/ingress-nginx-controller restarted
Before the rollout restart, this works fine of course, as with the other backend.
If I redeploy ingress-nginx with 5 replicas and maxUnavailable 2, I can also reproduce this issue:
(kind-kind/default) 0 ✓ izzi@Isabelles-MBP.barityo.lan ~ $ helm upgrade --install ingress-nginx ingress-nginx \
--repo https://kubernetes.github.io/ingress-nginx \
--version 4.11.3 \
--set controller.admissionWebhooks.enabled=false \
--set controller.replicaCount=5,controller.autoscaling.minAvailable=3 \
--set controller.livenessProbe.initialDelaySeconds=0,controller.livenessProbe.periodSeconds=1,controller.livenessProbe.timeoutSeconds=10,controller.livenessProbe.failureThreshold=600 \
--set controller.readinessProbe.initialDelaySeconds=0,controller.readinessProbe.periodSeconds=1,controller.readinessProbe.timeoutSeconds=10,controller.readinessProbe.failureThreshold=600 \
--namespace ingress-nginx --create-namespace
Release "ingress-nginx" has been upgraded. Happy Helming!
NAME: ingress-nginx
LAST DEPLOYED: Thu Oct 17 09:27:37 2024
NAMESPACE: ingress-nginx
STATUS: deployed
REVISION: 13
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the load balancer IP to be available.
You can watch the status by running 'kubectl get service --namespace ingress-nginx ingress-nginx-controller --output wide --watch'
An example Ingress that makes use of the controller:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: example
namespace: foo
spec:
ingressClassName: nginx
rules:
- host: www.example.com
http:
paths:
- pathType: Prefix
backend:
service:
name: exampleService
port:
number: 80
path: /
# This section is only required if TLS is to be enabled for the Ingress
tls:
- hosts:
- www.example.com
secretName: example-tls
If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:
apiVersion: v1
kind: Secret
metadata:
name: example-tls
namespace: foo
data:
tls.crt: <base64 encoded cert>
tls.key: <base64 encoded key>
type: kubernetes.io/tls
(kind-kind/default) 0 ✓ izzi@Isabelles-MBP.barityo.lan ~ $ kubectl --namespace ingress-nginx get pods
NAME READY STATUS RESTARTS AGE
ingress-nginx-controller-9df47b74c-htjs6 1/1 Running 0 6s
ingress-nginx-controller-9df47b74c-nncsr 1/1 Running 0 6s
ingress-nginx-controller-9df47b74c-rbfxb 1/1 Running 0 7m53s
ingress-nginx-controller-9df47b74c-tblkt 1/1 Running 0 6s
ingress-nginx-controller-9df47b74c-zcd99 1/1 Running 0 6s
(kind-kind/default) 0 ✓ izzi@Isabelles-MBP.barityo.lan ~ $ kubectl --namespace ingress-nginx --context kind-kind rollout restart deployment/ingress-nginx-controller
deployment.apps/ingress-nginx-controller restarted
While in my test pod:
test:~# get_url() {
curl \
--show-error \
--verbose \
--silent \
--output /tmp/curl-body.dat \
--write-out '%{http_code}\n' \
--header 'Host: httpd.example.com' \
http://ingress-nginx-controller.ingress-nginx.svc.cluster.local. \
2> /tmp/curl-error.log
}
# Curl httpd.example.com ingress until the http status is 404
while [ "$(get_url)" != 404 ]; do
# Nothing at all, as quickly as possible
:
done
# Print the last request error log and body.
cat /tmp/curl-error.log /tmp/curl-body.dat
* Host ingress-nginx-controller.ingress-nginx.svc.cluster.local.:80 was resolved.
* IPv6: (none)
* IPv4: 10.96.199.65
* Trying 10.96.199.65:80...
* Connected to ingress-nginx-controller.ingress-nginx.svc.cluster.local. (10.96.199.65) port 80
> GET / HTTP/1.1
> Host: httpd.example.com
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 404 Not Found
< Date: Thu, 17 Oct 2024 07:29:21 GMT
< Content-Type: text/html
< Content-Length: 146
< Connection: keep-alive
<
{ [146 bytes data]
* Connection #0 to host ingress-nginx-controller.ingress-nginx.svc.cluster.local. left intact
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>
test:~#
Hi, it's very helpful when the data is this abundant and precise.
I have a ton of things to communicate after this data, but I was wondering if we could get on a screen share to gather the precise data that is more relatable from the perspective of creating action items for the developers of this project.
Any chance you can meet on meet.jit.si?
I am also on Slack, if that works for you; the advantage is that a real-time conversation is possible, if that adds value.
What happened:
During the ingress-nginx pod boot-up sequence on Kubernetes, our clients receive HTTP 404 responses from nginx itself for HTTP paths that are declared in some of our Ingresses. This situation only happens when the pod is booting up, not when a hot reload sequence is initiated.
While the pod is marked as Ready in Kubernetes, we suspect that the nginx configuration is not fully loaded and some of the requests are forwarded to the upstream-default-backend upstream (see screenshot below and the pod logs in CSV).
For reference, we define quite a lot of Ingresses in our cluster with many different paths. The resulting nginx configuration is quite heavy to load, as it is approximately 67 MB.
Requests served by the default backend by each pod just after it starts up
Count of pods in “ready” state
You can see in the above two graphs that after 3 out of the 4 pods in the ingress-nginx-external-controller-7c8576cd ReplicaSet become Ready (the ingress-nginx-controller /healthz endpoint returns 200), several thousand requests are served by the default backend over the course of ~30s. This occurs even after the 10s initial delay for the readiness and liveness probes has been surpassed.
ingress-nginx-controller pod logs after startup. Notice the multiple reloads and the change of backend after the last reload.
```csv Date,Pod Name,Message 2024-10-15T13:37:44.477Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:44.477208 8 main.go:205] ""Creating API client"" host=""https://100.76.0.1:443""" 2024-10-15T13:37:44.483Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:44.483986 8 main.go:248] ""Running in Kubernetes cluster"" major=""1"" minor=""30"" git=""v1.30.5-gke.1014001"" state=""clean"" commit=""c9d757f7eeb6b159f3a64f6cb3bf7007d65c1f19"" platform=""linux/amd64""" 2024-10-15T13:37:44.570Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:44.570002 8 main.go:101] ""SSL fake certificate created"" file=""/etc/ingress-controller/ssl/default-fake-certificate.pem""" 2024-10-15T13:37:44.627Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:44.627215 8 nginx.go:271] ""Starting NGINX Ingress controller""" 2024-10-15T13:37:44.630Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:44.630735 8 store.go:535] ""ignoring ingressclass as the spec.controller is not the same of this ingress"" ingressclass=""nginx-internal""" 2024-10-15T13:37:44.845Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:44.845077 8 event.go:377] Event(v1.ObjectReference{Kind:""ConfigMap"", Namespace:""network"", Name:""ingress-nginx-external-controller"", UID:""8953e6b2-9833-4fb7-8339-a362a015f525"", APIVersion:""v1"", ResourceVersion:""1073189370"", FieldPath:""""}): type: 'Normal' reason: 'CREATE' ConfigMap network/ingress-nginx-external-controller" 2024-10-15T13:37:46.028Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:46.028658 8 nginx.go:317] ""Starting NGINX process""" 2024-10-15T13:37:46.029Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""",I1015 13:37:46.029013 8 leaderelection.go:254] attempting to acquire leader lease network/ingress-nginx-external-leader... 
2024-10-15T13:37:46.038Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:46.038088 8 status.go:85] ""New leader elected"" identity=""ingress-nginx-external-controller-865c5d89b5-ksxl5""" 2024-10-15T13:37:46.835Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:46.835434 8 controller.go:213] ""Backend successfully reloaded""" 2024-10-15T13:37:46.835Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:46.835571 8 controller.go:224] ""Initial sync, sleeping for 1 second""" 2024-10-15T13:37:46.835Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:46.835760 8 event.go:377] Event(v1.ObjectReference{Kind:""Pod"", Namespace:""network"", Name:""ingress-nginx-external-controller-56bbcdd967-qg7p7"", UID:""fbe312af-999f-48ab-952a-8dc437a3d4bc"", APIVersion:""v1"", ResourceVersion:""1079569510"", FieldPath:""""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration" 2024-10-15T13:37:49.728Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:49.728678 8 controller.go:193] ""Configuration changes detected, backend reload required""" 2024-10-15T13:37:51Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:37:58.178Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:58.178206 8 controller.go:213] ""Backend successfully reloaded""" 2024-10-15T13:37:58.178Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:37:58.178586 8 event.go:377] Event(v1.ObjectReference{Kind:""Pod"", Namespace:""network"", Name:""ingress-nginx-external-controller-56bbcdd967-qg7p7"", UID:""fbe312af-999f-48ab-952a-8dc437a3d4bc"", APIVersion:""v1"", ResourceVersion:""1079569510"", FieldPath:""""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration" 2024-10-15T13:37:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:37:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:37:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:37:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:37:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:37:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 
2024-10-15T13:37:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:37:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:37:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:38:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:38:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:38:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:38:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:38:59Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":404,""method"":""GET"",""proxyUpstreamName"":""upstream-default-backend""}" 2024-10-15T13:38:25.571Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:38:25.571800 8 status.go:85] ""New leader elected"" identity=""ingress-nginx-external-controller-865c5d89b5-cfhfc""" 2024-10-15T13:39:20.376Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""",------------------------------------------------------------------------------- 2024-10-15T13:39:20.376Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""",NGINX Ingress controller 2024-10-15T13:39:20.376Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""",Release: v1.11.3 2024-10-15T13:39:20.376Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""",Build: 0106de65cfccb74405a6dfa7d9daffc6f0a6ef1a 2024-10-15T13:39:20.376Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""",Repository: https://github.com/kubernetes/ingress-nginx 2024-10-15T13:39:20.376Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""",nginx version: nginx/1.25.5 2024-10-15T13:39:20.376Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""",------------------------------------------------------------------------------- 2024-10-15T13:39:44.157Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""",I1015 13:39:44.157221 8 leaderelection.go:268] successfully acquired lease network/ingress-nginx-external-leader 2024-10-15T13:39:44.157Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:39:44.157521 8 status.go:85] ""New leader elected"" identity=""ingress-nginx-external-controller-56bbcdd967-qg7p7""" 
2024-10-15T13:59:07.025Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:59:07.025134 8 controller.go:193] ""Configuration changes detected, backend reload required""" 2024-10-15T13:59:15.602Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:59:15.602900 8 controller.go:213] ""Backend successfully reloaded""" 2024-10-15T13:59:15.603Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","I1015 13:59:15.603247 8 event.go:377] Event(v1.ObjectReference{Kind:""Pod"", Namespace:""network"", Name:""ingress-nginx-external-controller-56bbcdd967-qg7p7"", UID:""fbe312af-999f-48ab-952a-8dc437a3d4bc"", APIVersion:""v1"", ResourceVersion:""1079569510"", FieldPath:""""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration" 2024-10-15T13:59:19Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":200,""method"":""GET"",""proxyUpstreamName"":""api-gateway""}" 2024-10-15T13:59:19Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":200,""method"":""GET"",""proxyUpstreamName"":""api-gateway""}" 2024-10-15T13:59:19Z,"""ingress-nginx-external-controller-56bbcdd967-qg7p7""","{""Attributes"":{""service"":{""name"":""nginx-ingress-controller""},""http"":{""status_code"":200,""method"":""GET"",""proxyUpstreamName"":""api-gateway""}" ```What you expected to happen:
The Ingress-nginx pod should not be marked as ready while still loading its configuration and we should not get HTTP 404 from nginx itself.
NGINX Ingress controller version
Kubernetes version (use kubectl version): Server Version: v1.30.5-gke.1014001
Environment:
Cloud provider or hardware configuration: Google Cloud Platform / GKE / GCE
OS (e.g. from /etc/os-release): https://cloud.google.com/container-optimized-os/docs/release-notes/m113#cos-113-18244-151-14
Kernel (e.g. uname -a): https://cos.googlesource.com/third_party/kernel/+/f2b7676b27982b8ce21e62319fceb9a0fd4131c5
Install tools: GKE
Basic cluster related info:
How was the ingress-nginx-controller installed:
ingress-nginx-controller is installed with ArgoCD using helm templating.
Current State of the controller:
kubectl describe ingressclasses
kubectl -n <ingresscontrollernamespace> get all -A -o wide
kubectl -n <ingresscontrollernamespace> describe po <ingresscontrollerpodname>
kubectl -n <ingresscontrollernamespace> describe svc <ingresscontrollerservicename>
How to reproduce this issue:
Install a kind cluster
Install the ingress controller
Install the ingress controller with modified liveness/readiness timings to improve the reproducibility. Admission webhooks are disabled here to avoid swamping ingress-nginx when creating the large number of ingresses required to reproduce this bug.
Create a simple service in the default namespace
Here we're creating a simple service using nc that will always return 200. It's not protocol-aware, just returning a static body.
Apply the manifests below:
Manifests
```yaml --- apiVersion: v1 data: server.sh: "#!/bin/sh\ncat <<- EOS\n\tHTTP/1.1 200 OK\r\n\tContent-Length: 14\r\n\r\n\tHello World!\r\nEOS" kind: ConfigMap metadata: name: hello namespace: default --- apiVersion: apps/v1 kind: Deployment metadata: name: hello namespace: default spec: progressDeadlineSeconds: 600 replicas: 1 revisionHistoryLimit: 10 selector: matchLabels: app: hello strategy: rollingUpdate: maxSurge: 25% maxUnavailable: 25% type: RollingUpdate template: metadata: creationTimestamp: null labels: app: hello spec: containers: - command: - nc - -lkvp - "8080" - -e - serve image: docker.io/alpine:3.14 imagePullPolicy: IfNotPresent name: hello ports: - containerPort: 8080 name: http protocol: TCP terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /usr/local/bin/serve name: programs subPath: server.sh dnsPolicy: ClusterFirst restartPolicy: Always schedulerName: default-scheduler terminationGracePeriodSeconds: 30 volumes: - configMap: defaultMode: 511 name: hello name: programs --- apiVersion: v1 kind: Service metadata: name: hello namespace: default spec: ipFamilies: - IPv4 ipFamilyPolicy: SingleStack ports: - name: http port: 80 targetPort: http selector: app: hello ```Create 1k ingresses pointing to this service
Run the below python script to create 1000 ingresses (hello-[0-999].example.com) with our simple service as the backend.
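The Python script itself is collapsed in the original report. Purely as an illustration (not the original script), an equivalent shell loop, assuming the hello Service from the manifests above, would be:

```sh
# Creates Ingresses hello-0 ... hello-999 for hosts hello-N.example.com,
# all backed by the "hello" Service on port 80.
for i in $(seq 0 999); do
  kubectl create ingress "hello-${i}" \
    --namespace default \
    --class nginx \
    --rule "hello-${i}.example.com/*=hello:80"
done
```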
You will have to wait some time for ingress-nginx to update its config with all these changes.
Create a test pod to confirm the service and ingress are alive.
In the console on this pod, run the following to confirm we have a stable environment:
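The exact commands are collapsed in the original report; a sketch of the kind of check intended, reusing the netshoot test pod and the curl pattern shown earlier in this thread:

```sh
# From inside the test pod, spot-check a few of the generated ingresses
# through the controller's ClusterIP Service; each should print 200.
for host in hello-0.example.com hello-500.example.com hello-999.example.com; do
  curl --silent --output /dev/null --write-out "${host}: %{http_code}\n" \
    --header "Host: ${host}" \
    http://ingress-nginx-controller.ingress-nginx.svc.cluster.local.
done
```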
You should see a successful response for each of them.
Generate load on the kubernetes API server / etcd
In order to reproduce this bug (reliably?), some load needs to be added to Kubernetes itself.
I use kube-burner here to create the load. You will need to wait until PUT/PATCH/DELETE commands are being run on existing Kubernetes objects in order to reproduce.
Below is my configuration. During the first job, objects are created, and this doesn't seem to be enough to reproduce the bug. Wait until the api-intensive-patch job has started before continuing.
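The invocation itself is not shown in the report; assuming kube-burner is installed locally and the templates referenced by the configuration sit under ./templates/, running it looks roughly like:

```sh
# Start the load; the api-intensive-patch job begins after the initial create job.
kube-burner init -c ./api-intensive.yml
```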
Configuration
`./api-intensive.yml`: ```yaml --- jobs: - name: api-intensive jobIterations: 50 qps: 4 burst: 4 namespacedIterations: true namespace: api-intensive podWait: false cleanup: true waitWhenFinished: true objects: - objectTemplate: templates/deployment.yaml replicas: 1 - objectTemplate: templates/configmap.yaml replicas: 1 - objectTemplate: templates/secret.yaml replicas: 1 - objectTemplate: templates/service.yaml replicas: 1 - name: api-intensive-patch jobType: patch jobIterations: 10 qps: 2 burst: 2 objects: - kind: Deployment objectTemplate: templates/deployment_patch_add_label.json labelSelector: {kube-burner-job: api-intensive} patchType: "application/json-patch+json" apiVersion: apps/v1 - kind: Deployment objectTemplate: templates/deployment_patch_add_pod_2.yaml labelSelector: {kube-burner-job: api-intensive} patchType: "application/apply-patch+yaml" apiVersion: apps/v1 - kind: Deployment objectTemplate: templates/deployment_patch_add_label.yaml labelSelector: {kube-burner-job: api-intensive} patchType: "application/strategic-merge-patch+json" apiVersion: apps/v1 - name: api-intensive-remove qps: 2 burst: 2 jobType: delete waitForDeletion: true objects: - kind: Deployment labelSelector: {kube-burner-job: api-intensive} apiVersion: apps/v1 - name: ensure-pods-removal qps: 10 burst: 10 jobType: delete waitForDeletion: true objects: - kind: Pod labelSelector: {kube-burner-job: api-intensive} - name: remove-services qps: 2 burst: 2 jobType: delete waitForDeletion: true objects: - kind: Service labelSelector: {kube-burner-job: api-intensive} - name: remove-configmaps-secrets qps: 2 burst: 2 jobType: delete objects: - kind: ConfigMap labelSelector: {kube-burner-job: api-intensive} - kind: Secret labelSelector: {kube-burner-job: api-intensive} ``` `./templates/deployment.yaml`: ```yaml apiVersion: apps/v1 kind: Deployment metadata: name: api-intensive-{{.Replica}} labels: group: load svc: api-intensive-{{.Replica}} spec: replicas: 1 selector: matchLabels: name: api-intensive-{{.Replica}} template: metadata: labels: group: load name: api-intensive-{{.Replica}} spec: containers: - image: registry.k8s.io/pause:3.1 name: api-intensive-{{.Replica}} resources: requests: cpu: 10m memory: 10M volumeMounts: - name: configmap mountPath: /var/configmap - name: secret mountPath: /var/secret dnsPolicy: Default terminationGracePeriodSeconds: 1 # Add not-ready/unreachable tolerations for 15 minutes so that node # failure doesn't trigger pod deletion. 
tolerations: - key: "node.kubernetes.io/not-ready" operator: "Exists" effect: "NoExecute" tolerationSeconds: 900 - key: "node.kubernetes.io/unreachable" operator: "Exists" effect: "NoExecute" tolerationSeconds: 900 volumes: - name: configmap configMap: name: configmap-{{.Replica}} - name: secret secret: secretName: secret-{{.Replica}} ``` `./templates/deployment_patch_add_pod_2.yaml`: ```yaml kind: Deployment apiVersion: apps/v1 spec: template: spec: containers: - image: registry.k8s.io/pause:3.1 name: api-intensive-2 resources: requests: cpu: 10m memory: 10M ``` `./templates/service.yaml`: ```yaml apiVersion: v1 kind: Service metadata: name: service-{{.Replica}} spec: selector: name: api-intensive-{{.Replica}} ports: - port: 80 targetPort: 80 ``` `./templates/deployment_patch_add_label.yaml`: ```yaml kind: Deployment apiVersion: apps/v1 metadata: labels: new_key_{{.Iteration}}: new_value_{{.Iteration}} ``` `./templates/deployment_patch_add_label.json`: ```yaml [ { "op": "add", "path": "/metadata/labels/new_key", "value": "new_value" } ] ``` `./templates/configmap.yaml`: ```yaml apiVersion: v1 kind: ConfigMap metadata: name: configmap-{{.Replica}} data: data.yaml: |- a: 1 b: 2 c: 3 ``` `./templates/secret.yaml`: ```yaml apiVersion: v1 kind: Secret metadata: name: secret-{{.Replica}} type: Opaque data: password: Zm9vb29vb29vb29vb29vbwo= ```GET constantly the endpoint
In the test pod we created earlier, run the following command which will curl the hello-700.example.com ingress constantly until 404 is returned.
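The command is collapsed in the original report; it follows the same pattern as the httpd.example.com loop shown earlier in this thread, e.g.:

```sh
# Poll the controller Service with Host: hello-700.example.com until nginx answers 404.
get_url() {
  curl \
    --show-error \
    --silent \
    --output /tmp/curl-body.dat \
    --write-out '%{http_code}\n' \
    --header 'Host: hello-700.example.com' \
    http://ingress-nginx-controller.ingress-nginx.svc.cluster.local. \
    2> /tmp/curl-error.log
}
while [ "$(get_url)" != 404 ]; do :; done
# Print the last request's error log and body.
cat /tmp/curl-error.log /tmp/curl-body.dat
```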
While this is running, in a different shell, perform a rollout restart of the ingress-nginx controller.
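For example, with the same command used elsewhere in this thread:

```sh
kubectl --namespace ingress-nginx rollout restart deployment ingress-nginx-controller
```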
It may take a couple of attempts of rolling out the deployment, but eventually you should see the loop in the test pod break and something similar to the following stderr and body printed:
Interestingly, you can't see the 404 in the logs of either the new or the old nginx pod in this reproduction. This is different from what we see in our production cluster, where the 404s are present in the logs of the newly created ingress-nginx pod.
Anything else we need to know:
Here's a breakdown of what I think are some of the details of the root cause of this bug in the code:
The health check here basically just checks that nginx is running (which it will be very early on) and that the /is-dynamic-lb-initialized path returns 2xx: https://github.com/kubernetes/ingress-nginx/blob/0edf16ff6bff89bd61750c38558b3bf801ec5ced/internal/ingress/controller/checker.go#L63-L66
The is-dynamic-lb-initialized location is handled by this Lua module, which just checks whether any backends are configured: https://github.com/kubernetes/ingress-nginx/blob/05eda3db8b23497d1da74013d3180780d50b0767/rootfs/etc/nginx/lua/nginx/ngx_conf_is_dynamic_lb_initialized.lua#L2-L6
This will basically always be true after the first reload: as soon as there is at least one ingress in the cache, the controller will detect a difference and trigger a reload with the new backends configuration (https://github.com/kubernetes/ingress-nginx/blob/05eda3db8b23497d1da74013d3180780d50b0767/internal/ingress/controller/controller.go#L195-L221), which calls OnUpdate here: https://github.com/kubernetes/ingress-nginx/blob/0edf16ff6bff89bd61750c38558b3bf801ec5ced/internal/ingress/controller/nginx.go#L749
My suspicion is that the cached ingresses in k8sStore do not represent the full state initially, but rather include ingresses returned from the API server in the first paginated response(s): https://github.com/kubernetes/ingress-nginx/blob/05eda3db8b23497d1da74013d3180780d50b0767/internal/ingress/controller/store/store.go#L1102
This can be validated by inspecting the number of ingresses returned by this function during startup when a large number of ingresses are present.
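One rough way to observe this from outside the process is sketched below. It assumes the templated server blocks roughly track the synced Ingresses; the counts won't match exactly because of the default server and multi-host Ingresses, so treat it as an indicator only.

```sh
# Pick a controller pod that has just become Ready after a rollout restart.
POD=$(kubectl --namespace ingress-nginx get pods \
  --selector app.kubernetes.io/name=ingress-nginx \
  --output jsonpath='{.items[0].metadata.name}')

# Number of Ingresses the cluster actually has...
kubectl get ingress --all-namespaces --no-headers | wc -l

# ...versus the number of server_name directives templated into nginx.conf so far.
kubectl --namespace ingress-nginx exec "$POD" -- grep -c 'server_name ' /etc/nginx/nginx.conf
```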
A potential solution would be to reject a reload of Nginx until we're sure that the cache is fully populated on the initial sync.