kubernetes / ingress-nginx

Ingress NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

Connection timed out for controller Pods. #11426

Open alifiroozi80 opened 5 months ago

alifiroozi80 commented 5 months ago

What happened:

After installing ingress-nginx, connections to its controller Pods time out.

NGINX Ingress controller version:

$ /nginx-ingress-controller --version
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v1.10.1
  Build:         4fb5aac1dd3669daa3a14d9de3e3cdb371b4c518
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.25.3

-------------------------------------------------------------------------------

Kubernetes version:

$ kubectl version
Client Version: v1.30.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.1

Environment:

PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
$ uname -a
Linux master-1 5.15.0-107-generic #117-Ubuntu SMP Fri Apr 26 12:26:49 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
$ kubectl get nodes -o wide
NAME       STATUS   ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
master-1   Ready    control-plane   5d21h   v1.30.1   10.0.10.215   <none>        Ubuntu 22.04.4 LTS   5.15.0-107-generic   containerd://1.7.2
master-2   Ready    control-plane   5d21h   v1.30.1   10.0.10.35    <none>        Ubuntu 22.04.4 LTS   5.15.0-107-generic   containerd://1.7.2
master-3   Ready    control-plane   5d21h   v1.30.1   10.0.10.14    <none>        Ubuntu 22.04.4 LTS   5.15.0-107-generic   containerd://1.7.2
worker-1   Ready    <none>          5d21h   v1.30.1   10.0.10.36    <none>        Ubuntu 22.04.4 LTS   5.15.0-107-generic   containerd://1.7.2
worker-2   Ready    <none>          5d21h   v1.30.1   10.0.10.152   <none>        Ubuntu 22.04.4 LTS   5.15.0-107-generic   containerd://1.7.2
worker-3   Ready    <none>          5d21h   v1.30.1   10.0.10.175   <none>        Ubuntu 22.04.4 LTS   5.15.0-107-generic   containerd://1.7.2
$ helm ls -A | grep -i ingress
ingress-nginx   ingress-nginx   1           2024-06-05 09:25:13.228270574 +0000 UTC deployed    ingress-nginx-4.10.1    1.10.1 
$ helm -n ingress-nginx get values ingress-nginx
USER-SUPPLIED VALUES:
controller:
  kind: DaemonSet
  metrics:
    enabled: true
  service:
    nodePorts:
      http: 32080
      https: 32443
    ports:
      http: 80
      https: 443
    targetPorts:
      http: http
      https: https
    type: NodePort
$ kubectl describe ingressclasses
Name:         nginx
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=ingress-nginx
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/part-of=ingress-nginx
              app.kubernetes.io/version=1.10.1
              helm.sh/chart=ingress-nginx-4.10.1
Annotations:  meta.helm.sh/release-name: ingress-nginx
              meta.helm.sh/release-namespace: ingress-nginx
Controller:   k8s.io/ingress-nginx
Events:       <none>
$ kubectl -n ingress-nginx get all -o wide
NAME                                 READY   STATUS    RESTARTS      AGE   IP                NODE       NOMINATED NODE   READINESS GATES
pod/ingress-nginx-controller-dd5g9   1/1     Running   1 (14m ago)   22m   192.168.226.78    worker-1   <none>           <none>
pod/ingress-nginx-controller-fh84s   1/1     Running   1 (14m ago)   22m   192.168.97.203    worker-3   <none>           <none>
pod/ingress-nginx-controller-nnvv7   1/1     Running   1 (14m ago)   22m   192.168.133.213   worker-2   <none>           <none>

NAME                                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE   SELECTOR
service/ingress-nginx-controller             NodePort    10.98.62.6      <none>        80:32080/TCP,443:32443/TCP   22m   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
service/ingress-nginx-controller-admission   ClusterIP   10.105.92.109   <none>        443/TCP                      22m   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
service/ingress-nginx-controller-metrics     ClusterIP   10.96.106.43    <none>        10254/TCP                    22m   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx

NAME                                      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE   CONTAINERS   IMAGES                                                                                                                     SELECTOR
daemonset.apps/ingress-nginx-controller   3         3         3       3            3           kubernetes.io/os=linux   22m   controller   registry.k8s.io/ingress-nginx/controller:v1.10.1@sha256:e24f39d3eed6bcc239a56f20098878845f62baa34b9f2be2fd2c38ce9fb0f29e   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
$ kubectl -n ingress-nginx describe pod/ingress-nginx-controller-dd5g9 
Name:             ingress-nginx-controller-dd5g9
Namespace:        ingress-nginx
Priority:         0
Service Account:  ingress-nginx
Node:             worker-1/10.0.10.36
Start Time:       Wed, 05 Jun 2024 09:25:17 +0000
Labels:           app.kubernetes.io/component=controller
                  app.kubernetes.io/instance=ingress-nginx
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=ingress-nginx
                  app.kubernetes.io/part-of=ingress-nginx
                  app.kubernetes.io/version=1.10.1
                  controller-revision-hash=74798dbdc7
                  helm.sh/chart=ingress-nginx-4.10.1
                  pod-template-generation=1
Annotations:      cni.projectcalico.org/containerID: 2befd8908088683891fe814873bb1d217d2cc41c8d73bcd2486c5e4a43ae5337
                  cni.projectcalico.org/podIP: 192.168.226.78/32
                  cni.projectcalico.org/podIPs: 192.168.226.78/32
Status:           Running
IP:               192.168.226.78
IPs:
  IP:           192.168.226.78
Controlled By:  DaemonSet/ingress-nginx-controller
Containers:
  controller:
    Container ID:    containerd://596956676e18cb14b89f383a21aeba864ddb5d93eb9912ad0b8c89be19c11d78
    Image:           registry.k8s.io/ingress-nginx/controller:v1.10.1@sha256:e24f39d3eed6bcc239a56f20098878845f62baa34b9f2be2fd2c38ce9fb0f29e
    Image ID:        registry.k8s.io/ingress-nginx/controller@sha256:e24f39d3eed6bcc239a56f20098878845f62baa34b9f2be2fd2c38ce9fb0f29e
    Ports:           80/TCP, 443/TCP, 10254/TCP, 8443/TCP
    Host Ports:      0/TCP, 0/TCP, 0/TCP, 0/TCP
    SeccompProfile:  RuntimeDefault
    Args:
      /nginx-ingress-controller
      --publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
      --election-id=ingress-nginx-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
      --validating-webhook=:8443
      --validating-webhook-certificate=/usr/local/certificates/cert
      --validating-webhook-key=/usr/local/certificates/key
    State:          Running
      Started:      Wed, 05 Jun 2024 09:33:45 +0000
    Last State:     Terminated
      Reason:       Unknown
      Exit Code:    255
      Started:      Wed, 05 Jun 2024 09:25:17 +0000
      Finished:     Wed, 05 Jun 2024 09:32:51 +0000
    Ready:          True
    Restart Count:  1
    Requests:
      cpu:      100m
      memory:   90Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       ingress-nginx-controller-dd5g9 (v1:metadata.name)
      POD_NAMESPACE:  ingress-nginx (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /usr/local/certificates/ from webhook-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-km2hj (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       True 
  ContainersReady             True 
  PodScheduled                True 
Volumes:
  webhook-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ingress-nginx-admission
    Optional:    false
  kube-api-access-km2hj:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason          Age                From                      Message
  ----     ------          ----               ----                      -------
  Normal   Scheduled       22m                default-scheduler         Successfully assigned ingress-nginx/ingress-nginx-controller-dd5g9 to worker-1
  Normal   Pulled          22m                kubelet                   Container image "registry.k8s.io/ingress-nginx/controller:v1.10.1@sha256:e24f39d3eed6bcc239a56f20098878845f62baa34b9f2be2fd2c38ce9fb0f29e" already present on machine
  Normal   Created         22m                kubelet                   Created container controller
  Normal   Started         22m                kubelet                   Started container controller
  Normal   RELOAD          22m                nginx-ingress-controller  NGINX reload triggered due to a change in configuration
  Normal   SandboxChanged  14m (x2 over 14m)  kubelet                   Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          14m                kubelet                   Container image "registry.k8s.io/ingress-nginx/controller:v1.10.1@sha256:e24f39d3eed6bcc239a56f20098878845f62baa34b9f2be2fd2c38ce9fb0f29e" already present on machine
  Normal   Created         14m                kubelet                   Created container controller
  Normal   Started         14m                kubelet                   Started container controller
  Warning  Unhealthy       13m (x2 over 14m)  kubelet                   Liveness probe failed: Get "http://192.168.226.78:10254/healthz": dial tcp 192.168.226.78:10254: connect: connection refused
  Warning  Unhealthy       13m (x2 over 14m)  kubelet                   Readiness probe failed: Get "http://192.168.226.78:10254/healthz": dial tcp 192.168.226.78:10254: connect: connection refused
  Normal   RELOAD          13m                nginx-ingress-controller  NGINX reload triggered due to a change in configuration
$ kubectl -n ingress-nginx describe service/ingress-nginx-controller 
Name:                     ingress-nginx-controller
Namespace:                ingress-nginx
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=ingress-nginx
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
                          app.kubernetes.io/version=1.10.1
                          helm.sh/chart=ingress-nginx-4.10.1
Annotations:              meta.helm.sh/release-name: ingress-nginx
                          meta.helm.sh/release-namespace: ingress-nginx
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type:                     NodePort
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.98.62.6
IPs:                      10.98.62.6
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  32080/TCP
Endpoints:                192.168.133.213:80,192.168.226.78:80,192.168.97.203:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  32443/TCP
Endpoints:                192.168.133.213:443,192.168.226.78:443,192.168.97.203:443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

The current state seems fine.

The problem is that the controller should answer on those NodePorts, but only the first request works; the next request times out:

$ curl -vvv http://10.0.10.36:32080
*   Trying 10.0.10.36:32080...
* Connected to 10.0.10.36 (10.0.10.36) port 32080 (#0)
> GET / HTTP/1.1
> Host: 10.0.10.36:32080
> User-Agent: curl/7.81.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< Date: Wed, 05 Jun 2024 09:51:11 GMT
< Content-Type: text/html
< Content-Length: 146
< Connection: keep-alive
< 
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>
* Connection #0 to host 10.0.10.36 left intact
$ curl -vvv http://10.0.10.36:32080
*   Trying 10.0.10.36:32080...
* connect to 10.0.10.36 port 32080 failed: Connection timed out
* Failed to connect to 10.0.10.36 port 32080 after 130260 ms: Connection timed out
* Closing connection 0
curl: (28) Failed to connect to 10.0.10.36 port 32080 after 130260 ms: Connection timed out
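The intermittent behaviour above is easier to characterise with repeated probes than with single manual requests. The following is a minimal sketch, not part of the original report: the function names `probe_once` and `probe_nodeport` are hypothetical, and the 5-second connect timeout and 10-second overall limit are arbitrary illustrative values. It uses only standard `curl` options.

```shell
# Probe a NodePort several times and report which attempts connect.
# Function names and timeout values are illustrative assumptions.

probe_once() {
  # Prints the HTTP status code; exits non-zero if the connection fails.
  curl -s -o /dev/null --connect-timeout 5 --max-time 10 \
    -w '%{http_code}' "http://$1:$2/"
}

probe_nodeport() {
  ip=$1
  port=$2
  attempts=$3
  i=1
  while [ "$i" -le "$attempts" ]; do
    if code=$(probe_once "$ip" "$port"); then
      # Any HTTP answer (even the 404 from the default backend) means
      # the TCP connection through the NodePort succeeded.
      echo "$ip:$port attempt $i -> HTTP $code"
    else
      echo "$ip:$port attempt $i -> connect failed or timed out"
    fi
    i=$((i + 1))
  done
}
```

Running, for example, `probe_nodeport 10.0.10.36 32080 10` against each worker address in turn would show whether the failures follow a pattern, such as roughly one success for every few attempts, which is what one would expect if only some of the backing Pods are reachable from the node that receives the request.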

How to reproduce this issue:

Install the ingress controller

helm install --create-namespace ingress-nginx -n ingress-nginx -f ~/nginx-values.yaml ./ingress-nginx/charts/ingress-nginx

Anything else we need to know:

Here are the endpoints as well:

$ kubectl -n ingress-nginx get ep
NAME                                 ENDPOINTS                                                               AGE
ingress-nginx-controller             192.168.133.215:443,192.168.226.81:443,192.168.97.205:443 + 3 more...   30m
ingress-nginx-controller-admission   192.168.133.215:8443,192.168.226.81:8443,192.168.97.205:8443            30m
ingress-nginx-controller-metrics     192.168.133.215:10254,192.168.226.81:10254,192.168.97.205:10254         30m
$ kubectl -n ingress-nginx describe ep/ingress-nginx-controller
Name:         ingress-nginx-controller
Namespace:    ingress-nginx
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=ingress-nginx
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/part-of=ingress-nginx
              app.kubernetes.io/version=1.10.1
              helm.sh/chart=ingress-nginx-4.10.1
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2024-06-05T09:50:36Z
Subsets:
  Addresses:          192.168.133.215,192.168.226.81,192.168.97.205
  NotReadyAddresses:  <none>
  Ports:
    Name   Port  Protocol
    ----   ----  --------
    https  443   TCP
    http   80    TCP

Events:  <none>
$ kubectl -n ingress-nginx get pod -o wide
NAME                             READY   STATUS    RESTARTS   AGE     IP                NODE       NOMINATED NODE   READINESS GATES
ingress-nginx-controller-s6qdg   1/1     Running   0          5m18s   192.168.133.215   worker-2   <none>           <none>
ingress-nginx-controller-xpld6   1/1     Running   0          5m43s   192.168.97.205    worker-3   <none>           <none>
ingress-nginx-controller-zwlrg   1/1     Running   0          5m31s   192.168.226.81    worker-1   <none>           <none>

As you can see, the endpoints match the Pod IPs.
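The comparison between the Service endpoints and the Pod IPs can also be done mechanically instead of by eye. This is a sketch under assumptions: the helper names `endpoint_ips`, `pod_ips`, and `same_set` are hypothetical, and the label selector is taken from the labels shown in the output above.

```shell
# Compare the addresses behind the Service with the controller Pod IPs.
# Helper names are illustrative; namespace and labels come from the report.
NS=ingress-nginx

endpoint_ips() {
  kubectl -n "$NS" get endpoints ingress-nginx-controller \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

pod_ips() {
  kubectl -n "$NS" get pods \
    -l app.kubernetes.io/component=controller \
    -o jsonpath='{.items[*].status.podIP}'
}

# Succeeds when both whitespace-separated lists hold the same addresses,
# regardless of order. $1 and $2 are left unquoted on purpose so that
# each address ends up on its own line before sorting.
same_set() {
  a=$(printf '%s\n' $1 | sort)
  b=$(printf '%s\n' $2 | sort)
  [ "$a" = "$b" ]
}
```

With a working `kubectl` context, `same_set "$(endpoint_ips)" "$(pod_ips)" && echo "endpoints match pods"` gives a yes/no answer that is easy to re-run after Pods restart and receive new IPs.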

k8s-ci-robot commented 5 months ago

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

longwuyuan commented 5 months ago

/remove-kind bug
/kind support
/triage needs-information

The default install of the controller is not a DaemonSet. Can you test a default install with no customized values file?

The DaemonSet and NodePort use cases require advanced awareness, so the information provided has to be much more granular and wider in scope than what the new bug-report template asks for. Try a default install without any custom values file and post the data from that. Then use a values file of your liking, or the one you posted, and post the data from that as well.
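The two-step test requested here could look roughly like the sketch below. It is an illustration, not the maintainer's exact procedure: the function names and the `HELM` override variable are hypothetical, and it installs from the public chart repository rather than the local chart checkout used in the original report.

```shell
# Step 1: default install. Step 2: same release with a custom values file.
# Function names and the HELM variable are illustrative assumptions.
HELM=${HELM:-helm}

install_default() {
  "$HELM" upgrade --install ingress-nginx ingress-nginx \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace ingress-nginx --create-namespace
}

install_with_values() {
  # $1 is the path to the values file, e.g. ~/nginx-values.yaml
  "$HELM" upgrade --install ingress-nginx ingress-nginx \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace ingress-nginx --create-namespace \
    --values "$1"
}
```

Removing the existing release first (`helm -n ingress-nginx uninstall ingress-nginx`) before calling `install_default` keeps previously supplied values from carrying over, so the first data set reflects a clean default install. Collecting `kubectl -n ingress-nginx get all -o wide` after each step gives the two sets of data to post.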

github-actions[bot] commented 4 months ago

This is stale, but we won't close it automatically. Just bear in mind that the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any questions or want to request prioritization, please reach out in #ingress-nginx-dev on Kubernetes Slack.