nginxinc / kubernetes-ingress

NGINX and NGINX Plus Ingress Controllers for Kubernetes
https://docs.nginx.com/nginx-ingress-controller
Apache License 2.0

nginx-ingress routes traffic to not-ready pods #3534

Closed. markjgardner closed this issue 1 year ago.

markjgardner commented 1 year ago

Describe the bug: The Ingress continues to route traffic to backend pods that are marked as not ready due to a failing readinessProbe.

To Reproduce

Expected behavior: Requests should never be routed to a not-ready pod.

Your environment

Additional context: This behavior is not present in https://kubernetes.github.io/ingress-nginx (requests are never routed to not-ready pods).
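
For concreteness, a minimal sketch of the kind of backend that reproduces this (the names, image, and file-based readiness probe mirror the kubectl describe output later in this thread; how /tmp/healthy is created and removed is an assumption):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-app
  labels:
    app: simple-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: simple-app
  template:
    metadata:
      labels:
        app: simple-app
    spec:
      containers:
      - name: aspnet
        image: mcr.microsoft.com/dotnet/samples:aspnetapp
        ports:
        - containerPort: 80
        # The probe passes only while /tmp/healthy exists; removing the file in
        # one pod (e.g. kubectl exec <pod> -- rm /tmp/healthy) marks it not ready.
        readinessProbe:
          exec:
            command: ["cat", "/tmp/healthy"]
          initialDelaySeconds: 3
          periodSeconds: 3
          timeoutSeconds: 1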

github-actions[bot] commented 1 year ago

Hi @markjgardner thanks for reporting!

Be sure to check out the docs while you wait for a human to take a look at this :slightly_smiling_face:

Cheers!

vepatel commented 1 year ago

Hi @markjgardner, "Version of the Ingress Controller == 4.4.2" looks like https://github.com/kubernetes/ingress-nginx/releases/tag/helm-chart-4.4.2 to me; the latest release for this project is 3.0.1.

markjgardner commented 1 year ago

Whoops, sorry, you are correct @vepatel. I was running 3.0.1 of nginx-ingress when I replicated the failure scenario; I just grabbed the version off the wrong ingress when reporting the issue. Apologies.

vepatel commented 1 year ago

Can you please provide the pod logs, the output of kubectl get pods -n <namespace>, and the describe output of your daemonset/deployment?

markjgardner commented 1 year ago

Here you go @vepatel

$> k get po
NAME                                           READY   STATUS    RESTARTS   AGE
nginx-ingress-nginx-ingress-76d5956b79-p8gf2   1/1     Running   0          4m36s
simple-app-6d58c497f5-qg87w                    1/1     Running   0          74m
simple-app-6d58c497f5-r5bm5                    1/1     Running   0          74m
simple-app-6d58c497f5-ztbbk                    0/1     Running   0          74m
$> k describe deploy simple-app
Name:                   simple-app
Namespace:              default
CreationTimestamp:      Wed, 08 Feb 2023 13:37:28 +0000
Labels:                 app=simple-app
Annotations:            deployment.kubernetes.io/revision: 4
Selector:               app=simple-app
Replicas:               3 desired | 3 updated | 3 total | 2 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=simple-app
  Containers:
   aspnet:
    Image:        mcr.microsoft.com/dotnet/samples:aspnetapp
    Port:         80/TCP
    Host Port:    0/TCP
    Readiness:    exec [cat /tmp/healthy] delay=3s timeout=1s period=3s #success=1 #failure=3
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      False   MinimumReplicasUnavailable
OldReplicaSets:  <none>
NewReplicaSet:   simple-app-6d58c497f5 (3/3 replicas created)
Events:          <none>
vepatel commented 1 year ago

Sorry, I meant your Ingress Controller pod logs and the describe output of the Ingress Controller deployment.

markjgardner commented 1 year ago

@vepatel

$> k describe deploy nginx-ingress-nginx-ingress
Name:                   nginx-ingress-nginx-ingress
Namespace:              default
CreationTimestamp:      Wed, 08 Feb 2023 17:48:15 +0000
Labels:                 app.kubernetes.io/instance=nginx-ingress
                        app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/name=nginx-ingress-nginx-ingress
                        helm.sh/chart=nginx-ingress-0.16.1
Annotations:            deployment.kubernetes.io/revision: 1
                        meta.helm.sh/release-name: nginx-ingress
                        meta.helm.sh/release-namespace: default
Selector:               app=nginx-ingress-nginx-ingress
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app=nginx-ingress-nginx-ingress
  Annotations:      prometheus.io/port: 9113
                    prometheus.io/scheme: http
                    prometheus.io/scrape: true
  Service Account:  nginx-ingress-nginx-ingress
  Containers:
   nginx-ingress-nginx-ingress:
    Image:       nginx/nginx-ingress:3.0.1
    Ports:       80/TCP, 443/TCP, 9113/TCP, 8081/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP, 0/TCP
    Args:
      -nginx-plus=false
      -nginx-reload-timeout=60000
      -enable-app-protect=false
      -enable-app-protect-dos=false
      -nginx-configmaps=$(POD_NAMESPACE)/nginx-ingress-nginx-ingress
      -default-server-tls-secret=$(POD_NAMESPACE)/nginx-ingress-nginx-ingress-default-server-tls
      -ingress-class=nginx
      -health-status=false
      -health-status-uri=/nginx-health
      -nginx-debug=false
      -v=1
      -nginx-status=true
      -nginx-status-port=8080
      -nginx-status-allow-cidrs=127.0.0.1
      -report-ingress-status
      -external-service=nginx-ingress-nginx-ingress
      -enable-leader-election=true
      -leader-election-lock-name=nginx-ingress-nginx-ingress-leader-election
      -enable-prometheus-metrics=true
      -prometheus-metrics-listen-port=9113
      -prometheus-tls-secret=
      -enable-service-insight=false
      -service-insight-listen-port=9114
      -service-insight-tls-secret=
      -enable-custom-resources=true
      -enable-snippets=false
      -include-year=false
      -disable-ipv6=false
      -enable-tls-passthrough=false
      -enable-preview-policies=false
      -enable-cert-manager=false
      -enable-oidc=false
      -enable-external-dns=false
      -ready-status=true
      -ready-status-port=8081
      -enable-latency-metrics=false
    Requests:
      cpu:      100m
      memory:   128Mi
    Readiness:  http-get http://:readiness-port/nginx-ready delay=0s timeout=1s period=1s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:   (v1:metadata.namespace)
      POD_NAME:        (v1:metadata.name)
    Mounts:           <none>
  Volumes:            <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   nginx-ingress-nginx-ingress-76d5956b79 (1/1 replicas created)
Events:          <none>
$> k logs nginx-ingress-nginx-ingress-76d5956b79-p8gf2
NGINX Ingress Controller Version=3.0.1 Commit=051928835aaa724a9718e22ce2f7acc8a323317f Date=2023-01-25T23:32:26Z DirtyState=false Arch=linux/amd64 Go=go1.19.5
I0208 17:48:19.694052       1 flags.go:294] Starting with flags: ["-nginx-plus=false" "-nginx-reload-timeout=60000" "-enable-app-protect=false" "-enable-app-protect-dos=false" "-nginx-configmaps=default/nginx-ingress-nginx-ingress" "-default-server-tls-secret=default/nginx-ingress-nginx-ingress-default-server-tls" "-ingress-class=nginx" "-health-status=false" "-health-status-uri=/nginx-health" "-nginx-debug=false" "-v=1" "-nginx-status=true" "-nginx-status-port=8080" "-nginx-status-allow-cidrs=127.0.0.1" "-report-ingress-status" "-external-service=nginx-ingress-nginx-ingress" "-enable-leader-election=true" "-leader-election-lock-name=nginx-ingress-nginx-ingress-leader-election" "-enable-prometheus-metrics=true" "-prometheus-metrics-listen-port=9113" "-prometheus-tls-secret=" "-enable-service-insight=false" "-service-insight-listen-port=9114" "-service-insight-tls-secret=" "-enable-custom-resources=true" "-enable-snippets=false" "-include-year=false" "-disable-ipv6=false" "-enable-tls-passthrough=false" "-enable-preview-policies=false" "-enable-cert-manager=false" "-enable-oidc=false" "-enable-external-dns=false" "-ready-status=true" "-ready-status-port=8081" "-enable-latency-metrics=false"]
I0208 17:48:19.731261       1 main.go:227] Kubernetes version: 1.25.5
I0208 17:48:19.742156       1 main.go:373] Using nginx version: nginx/1.23.3
2023/02/08 17:48:19 [notice] 17#17: using the "epoll" event method
2023/02/08 17:48:19 [notice] 17#17: nginx/1.23.3
2023/02/08 17:48:19 [notice] 17#17: built by gcc 10.2.1 20210110 (Debian 10.2.1-6) 
2023/02/08 17:48:19 [notice] 17#17: OS: Linux 5.4.0-1101-azure
2023/02/08 17:48:19 [notice] 17#17: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2023/02/08 17:48:19 [notice] 17#17: start worker processes
2023/02/08 17:48:19 [notice] 17#17: start worker process 18
2023/02/08 17:48:19 [notice] 17#17: start worker process 19
2023/02/08 17:48:19 [notice] 17#17: start worker process 20
2023/02/08 17:48:19 [notice] 17#17: start worker process 21
I0208 17:48:19.769860       1 listener.go:54] Starting Prometheus listener on: :9113/metrics
I0208 17:48:19.771245       1 leaderelection.go:248] attempting to acquire leader lease default/nginx-ingress-nginx-ingress-leader-election...
I0208 17:48:19.828558       1 leaderelection.go:258] successfully acquired lease default/nginx-ingress-nginx-ingress-leader-election
I0208 17:48:20.272070       1 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"simpleapp-ingress", UID:"fdec2bf0-cdee-40a5-9272-0db88890b5fd", APIVersion:"networking.k8s.io/v1", ResourceVersion:"63733244", FieldPath:""}): type: 'Normal' reason: 'AddedOrUpdated' Configuration for default/simpleapp-ingress was added or updated 
I0208 17:48:20.284291       1 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"simpleapp-ingress", UID:"fdec2bf0-cdee-40a5-9272-0db88890b5fd", APIVersion:"networking.k8s.io/v1", ResourceVersion:"63733244", FieldPath:""}): type: 'Normal' reason: 'AddedOrUpdated' Configuration for default/simpleapp-ingress was added or updated 
I0208 17:48:20.293621       1 event.go:285] Event(v1.ObjectReference{Kind:"Secret", Namespace:"default", Name:"nginx-ingress-nginx-ingress-default-server-tls", UID:"fb6d1479-2057-4418-a061-343bc28370bf", APIVersion:"v1", ResourceVersion:"63761508", FieldPath:""}): type: 'Normal' reason: 'Updated' the special Secret default/nginx-ingress-nginx-ingress-default-server-tls was updated
2023/02/08 17:48:20 [notice] 17#17: signal 1 (SIGHUP) received from 26, reconfiguring
2023/02/08 17:48:20 [notice] 17#17: reconfiguring
2023/02/08 17:48:20 [notice] 17#17: using the "epoll" event method
2023/02/08 17:48:20 [notice] 17#17: start worker processes
2023/02/08 17:48:20 [notice] 17#17: start worker process 27
2023/02/08 17:48:20 [notice] 17#17: start worker process 28
2023/02/08 17:48:20 [notice] 17#17: start worker process 29
2023/02/08 17:48:20 [notice] 17#17: start worker process 30
2023/02/08 17:48:20 [notice] 18#18: gracefully shutting down
2023/02/08 17:48:20 [notice] 19#19: gracefully shutting down
2023/02/08 17:48:20 [notice] 18#18: exiting
2023/02/08 17:48:20 [notice] 19#19: exiting
2023/02/08 17:48:20 [notice] 19#19: exit
2023/02/08 17:48:20 [notice] 18#18: exit
2023/02/08 17:48:20 [notice] 20#20: gracefully shutting down
2023/02/08 17:48:20 [notice] 20#20: exiting
2023/02/08 17:48:20 [notice] 20#20: exit
2023/02/08 17:48:20 [notice] 21#21: gracefully shutting down
2023/02/08 17:48:20 [notice] 21#21: exiting
2023/02/08 17:48:20 [notice] 21#21: exit
I0208 17:48:20.436166       1 event.go:285] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"default", Name:"nginx-ingress-nginx-ingress", UID:"ab301236-a36d-438c-8204-db6cb8f8d03e", APIVersion:"v1", ResourceVersion:"63761510", FieldPath:""}): type: 'Normal' reason: 'Updated' Configuration from default/nginx-ingress-nginx-ingress was updated 
I0208 17:48:20.436196       1 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"simpleapp-ingress", UID:"fdec2bf0-cdee-40a5-9272-0db88890b5fd", APIVersion:"networking.k8s.io/v1", ResourceVersion:"63761585", FieldPath:""}): type: 'Normal' reason: 'AddedOrUpdated' Configuration for default/simpleapp-ingress was added or updated 
2023/02/08 17:48:20 [notice] 17#17: signal 17 (SIGCHLD) received from 18
2023/02/08 17:48:20 [notice] 17#17: worker process 18 exited with code 0
2023/02/08 17:48:20 [notice] 17#17: signal 29 (SIGIO) received
2023/02/08 17:48:20 [notice] 17#17: signal 17 (SIGCHLD) received from 19
2023/02/08 17:48:20 [notice] 17#17: worker process 19 exited with code 0
2023/02/08 17:48:20 [notice] 17#17: worker process 20 exited with code 0
2023/02/08 17:48:20 [notice] 17#17: worker process 21 exited with code 0
2023/02/08 17:48:20 [notice] 17#17: signal 29 (SIGIO) received
2023/02/08 17:48:20 [notice] 17#17: signal 17 (SIGCHLD) received from 20
2023/02/08 17:49:01 [notice] 17#17: signal 1 (SIGHUP) received from 32, reconfiguring
2023/02/08 17:49:01 [notice] 17#17: reconfiguring
2023/02/08 17:49:01 [notice] 17#17: using the "epoll" event method
2023/02/08 17:49:01 [notice] 17#17: start worker processes
2023/02/08 17:49:01 [notice] 17#17: start worker process 33
2023/02/08 17:49:01 [notice] 17#17: start worker process 34
2023/02/08 17:49:01 [notice] 17#17: start worker process 35
2023/02/08 17:49:01 [notice] 17#17: start worker process 36
2023/02/08 17:49:01 [notice] 27#27: gracefully shutting down
2023/02/08 17:49:01 [notice] 28#28: gracefully shutting down
2023/02/08 17:49:01 [notice] 30#30: gracefully shutting down
2023/02/08 17:49:01 [notice] 30#30: exiting
2023/02/08 17:49:01 [notice] 27#27: exiting
2023/02/08 17:49:01 [notice] 29#29: gracefully shutting down
2023/02/08 17:49:01 [notice] 29#29: exiting
2023/02/08 17:49:01 [notice] 30#30: exit
2023/02/08 17:49:01 [notice] 27#27: exit
2023/02/08 17:49:01 [notice] 29#29: exit
2023/02/08 17:49:01 [notice] 28#28: exiting
2023/02/08 17:49:01 [notice] 28#28: exit
I0208 17:49:01.490268       1 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"simpleapp-ingress", UID:"fdec2bf0-cdee-40a5-9272-0db88890b5fd", APIVersion:"networking.k8s.io/v1", ResourceVersion:"63761898", FieldPath:""}): type: 'Normal' reason: 'AddedOrUpdated' Configuration for default/simpleapp-ingress was added or updated 
2023/02/08 17:49:01 [notice] 17#17: signal 17 (SIGCHLD) received from 30
2023/02/08 17:49:01 [notice] 17#17: worker process 27 exited with code 0
2023/02/08 17:49:01 [notice] 17#17: worker process 28 exited with code 0
2023/02/08 17:49:01 [notice] 17#17: worker process 30 exited with code 0
2023/02/08 17:49:01 [notice] 17#17: signal 29 (SIGIO) received
2023/02/08 17:49:01 [notice] 17#17: signal 17 (SIGCHLD) received from 28
2023/02/08 17:49:01 [notice] 17#17: signal 17 (SIGCHLD) received from 29
2023/02/08 17:49:01 [notice] 17#17: worker process 29 exited with code 0
2023/02/08 17:49:01 [notice] 17#17: signal 29 (SIGIO) received
166.198.116.168 - - [08/Feb/2023:17:50:43 +0000] "GET / HTTP/1.1" 200 3819 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
2023/02/08 17:50:43 [warn] 33#33: *5 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/1/00/0000000001 while reading upstream, client: 166.198.116.168, server: 20.121.81.115.nip.io, request: "GET /lib/bootstrap/dist/css/bootstrap.min.css HTTP/1.1", upstream: "http://10.1.0.104:80/lib/bootstrap/dist/css/bootstrap.min.css", host: "20.121.81.115.nip.io", referrer: "http://20.121.81.115.nip.io/"
166.198.116.168 - - [08/Feb/2023:17:50:43 +0000] "GET /css/site.css?v=AKvNjO3dCPPS0eSU1Ez8T2wI280i08yGycV9ndytL-c HTTP/1.1" 200 194 "http://20.121.81.115.nip.io/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:50:43 +0000] "GET /aspnetapp.styles.css?v=dmaWIJMtYHjABWevZ_2Q8P4v1xrVPOBMkiL86DlKmX8 HTTP/1.1" 200 1077 "http://20.121.81.115.nip.io/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:50:43 +0000] "GET /js/site.js?v=4q1jwFhaPaZgr8WAUSrux6hAuh0XDg9kPS3xIVq36I0 HTTP/1.1" 200 230 "http://20.121.81.115.nip.io/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:50:43 +0000] "GET /lib/bootstrap/dist/css/bootstrap.min.css HTTP/1.1" 200 162720 "http://20.121.81.115.nip.io/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:50:43 +0000] "GET /lib/bootstrap/dist/js/bootstrap.bundle.min.js HTTP/1.1" 200 78468 "http://20.121.81.115.nip.io/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:50:44 +0000] "GET /lib/jquery/dist/jquery.min.js HTTP/1.1" 200 89476 "http://20.121.81.115.nip.io/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:50:44 +0000] "GET /favicon.ico HTTP/1.1" 200 5430 "http://20.121.81.115.nip.io/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:52:28 +0000] "GET / HTTP/1.1" 200 3818 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:52:29 +0000] "GET / HTTP/1.1" 200 3820 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:52:30 +0000] "GET / HTTP/1.1" 200 3818 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:52:31 +0000] "GET / HTTP/1.1" 200 3819 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:52:31 +0000] "GET / HTTP/1.1" 200 3819 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:52:32 +0000] "GET / HTTP/1.1" 200 3820 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
166.198.116.168 - - [08/Feb/2023:17:52:32 +0000] "GET / HTTP/1.1" 200 3819 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78" "-"
markjgardner commented 1 year ago

@vepatel I did some more forensics on this, and it sure looks like nginx-ingress is bypassing the ClusterIP service and proxying directly to the pod backends. This would explain why those pods aren't being removed when the readiness probe fails. My question is: why is it doing this?


$> cat /etc/nginx/conf.d/default-simpleapp-ingress.conf

upstream default-simpleapp-ingress-<someip>.nip.io-simple-app-80 {
        zone default-simpleapp-ingress-<someip>.nip.io-simple-app-80 256k;
        random two least_conn;

        server 10.1.0.26:80 max_fails=1 fail_timeout=10s max_conns=0;
        server 10.1.0.10:80 max_fails=1 fail_timeout=10s max_conns=0;
        server 10.1.0.31:80 max_fails=1 fail_timeout=10s max_conns=0;
}

server {
        listen 80;
        listen [::]:80;

        server_tokens on;

        server_name <someip>.nip.io;

        set $resource_type "ingress";
        set $resource_name "simpleapp-ingress";
        set $resource_namespace "default";

        location / {
                set $service "simple-app";

                proxy_http_version 1.1;

                proxy_connect_timeout 60s;
                proxy_read_timeout 60s;
                proxy_send_timeout 60s;
                client_max_body_size 1m;
                proxy_set_header Host $host;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Forwarded-Host $host;
                proxy_set_header X-Forwarded-Port $server_port;
                proxy_set_header X-Forwarded-Proto $scheme;
                proxy_buffering on;

                proxy_pass http://default-simpleapp-ingress-<someip>.nip.io-simple-app-80;
        }
}
jasonwilliams14 commented 1 year ago

@markjgardner Thanks for reporting this to us. We are going to take a look and review this on our side.

it sure looks like nginx-ingress is bypassing the ClusterIP service and proxying directly to the pod backends

You are correct. We talk directly to the endpoint IPs, which we gather from the service. This allows us to take advantage of NGINX Plus features like active health checking, giving customers configurable options in the NGINX Ingress Controller. You can see some of the supported options here:

https://docs.nginx.com/nginx-ingress-controller/configuration/virtualserver-and-virtualserverroute-resources/#upstreamhealthcheck
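
For example, a VirtualServer upstream with an active health check might look roughly like the sketch below (based on the linked docs; active health checks require NGINX Plus, and the host, path, and service names here are illustrative):

apiVersion: k8s.nginx.org/v1
kind: VirtualServer
metadata:
  name: simple-app
spec:
  host: app.example.com
  upstreams:
  - name: backend
    service: simple-app
    port: 80
    # Active health checking (NGINX Plus only): probe each endpoint and stop
    # sending traffic to endpoints that fail the check.
    healthCheck:
      enable: true
      path: /healthz
      interval: 10s
      fails: 3
      passes: 1
  routes:
  - path: /
    action:
      pass: backend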

We will review and update this issue when we have finished our analysis. Thank you again.

brianehlert commented 1 year ago

The service endpoint is bypassed to provide true load balancing support across the backend service pods, as well as to support behaviors such as sticky sessions, cookie persistence, and other capabilities that are necessary for non-stateless services and many TCP (non-HTTP) services. (There are a lot of non-stateless and TCP/UDP services out there.)

Individual pods can react and behave differently, and forwarding traffic directly to the individual pods allows the system to respond in a more natural way. Across a cluster, some pods on some nodes may respond faster and can therefore handle additional load. If the service endpoint IP is used, kube-proxy and its random-ish distribution behavior take over, and you rely on the infrastructure to wait for a pod to begin to fail before anything happens across the system.

NGINX is constantly monitoring the responsiveness of the backends to determine whether they are still healthy and should be receiving traffic. This is more granular behavior awareness than a readiness probe might provide. It also allows NGINX to inform you (the user) that particular pods are returning 500s or other response codes, and which particular pod it is. This is exposed in the Prometheus output of the NGINX Plus version of our implementation.

In the end, it is all about adding additional value to the system to give the end customer the best experience possible.

markjgardner commented 1 year ago

OK, I totally understand that this is intentional behavior. But if I'm understanding you correctly, there is still a bug, as your health checks are failing to detect not-ready pods and remove them from the backend. Or there is some non-intuitive, non-default configuration that I am missing.

As for bypassing the k8s service... I don't know much about NGINX Plus, so forgive my ignorance: if you don't actually need the load balancing provided by a typical k8s service, why not put a validation requirement on your ingress controller to require clusterIP: None on backing services? It seems like that would give you everything you need (IPs for the backing pods) without getting in the way of your smarter ingress model. It would also act as a pretty clear flag to the uninitiated that there are non-conventional ingress semantics at play here.
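
For illustration, the headless-service idea above amounts to something like the following sketch (not something the controller requires today; a headless Service simply skips ClusterIP allocation, leaving only the pod endpoints):

apiVersion: v1
kind: Service
metadata:
  name: simple-app
spec:
  # Headless: no virtual IP is allocated, so there is no kube-proxy load
  # balancing to bypass; consumers work directly with the pod endpoints.
  clusterIP: None
  selector:
    app: simple-app
  ports:
  - port: 80
    targetPort: 80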

brianehlert commented 1 year ago

if you don't actually need the load balancing provided by a typical k8s service, why not put a validation requirement on your ingress controller to require clusterIP: None on backing services?

We do this the other way around. We assume that you want the extra capabilities of load balancing controls, sticky sessions, smarter traffic distribution, etc. that any proxy brings to the table. We give you the ability, through our VirtualServer CRD, to use the ClusterIP for an upstream service on a service-by-service basis, knowing that not all services are the same: https://docs.nginx.com/nginx-ingress-controller/configuration/virtualserver-and-virtualserverroute-resources/#upstream
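
A rough sketch of that per-upstream option (the use-cluster-ip field is described in the linked upstream docs; the host and names here are illustrative):

apiVersion: k8s.nginx.org/v1
kind: VirtualServer
metadata:
  name: simple-app
spec:
  host: app.example.com
  upstreams:
  - name: backend
    service: simple-app
    port: 80
    # Route to the Service's ClusterIP and let kube-proxy distribute traffic,
    # instead of proxying directly to the individual pod endpoints.
    use-cluster-ip: true
  routes:
  - path: /
    action:
      pass: backend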

This is also how this project is compatible with Linkerd, Istio, and Open Service Mesh. It is not necessary for NGINX Service Mesh as we function a bit differently under the hood.

This is available in both the free and paid editions - the resources are identical.

In regards to the behavior originally described in this ticket: this is a gap in our implementation of the EndpointSlices API that we are taking care of. The Endpoints API behaved a bit differently, and the ready status of the new API was missed. There will be a fix coming.

wc-s commented 1 year ago

Just want to say that we are observing this problem as well, but it sounds like you guys already reproduced this and don't need any more confirmation.

vepatel commented 1 year ago

Hi @wc-s, yes, we were able to reproduce the issue and a fix is in the works.

brianehlert commented 1 year ago

A patch is forthcoming. Thank you all! Sorry for any problems.

vepatel commented 1 year ago

@wc-s @markjgardner The fix for this issue is available in main now; there will be a 3.0.2 release later.

wc-s commented 1 year ago

@vepatel @brianehlert

Thanks! We already downgraded to 2.4.2 and will try out the fix when the corresponding Helm chart is released.

vepatel commented 1 year ago

@wc-s @markjgardner The new patch release 3.0.2 with the fix is now live; see https://github.com/nginxinc/kubernetes-ingress/releases/tag/v3.0.2

markjgardner commented 1 year ago

Tested and verified. Thanks for the quick turnaround.