kubernetes / ingress-nginx

Ingress-NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

Ingress controller processing ingresses with different ingressClassName when multiple ingress controllers are deployed in the same namespace - AWS EKS 1.27 #10907

Open mdellavedova opened 8 months ago

mdellavedova commented 8 months ago

What happened:

I have two ingress controllers deployed in the same namespace, set up following the instructions in these documents: https://kubernetes.github.io/ingress-nginx/user-guide/k8s-122-migration/#i-cant-use-multiple-namespaces-what-should-i-do and https://kubernetes.github.io/ingress-nginx/user-guide/multiple-ingress/#multiple-ingress-controllers. The ingresses work as expected, but when I look at the logs of one ingress controller I can see many errors like:

I0123 10:02:35.684672       7 store.go:436] "Ignoring ingress because of error while validating ingress class" ingress="omega/cs-05c36933-076c-490f-a23b-d6d5019d1cb2-api-gw" error="no object matching key \"ingress-controller-internal-nginx\" in local store"

suggesting that each ingress controller is evaluating ingresses that belong to the other controller, and vice versa. This creates a high load on (one of) the ingress controller's pods, causing it to restart.

What you expected to happen:

I would expect both ingress controllers to ignore ingresses that don't have their associated ingressClassName.
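To illustrate the expected class-based filtering, here is a minimal sketch (the class names and controller values are taken from the IngressClass output further down; the example Ingress, host, and backend service names are hypothetical): each controller should only act on Ingresses whose ingressClassName references its own IngressClass and ignore the rest.

apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: ingress-controller-internal-nginx
spec:
  controller: k8s.io/ingress-nginx-internal            # watched by the internal controller
---
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: nginx-public-nlb-tls
spec:
  controller: k8s.io/ingress-nginx-public-nlb-tls      # watched by the public controller
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-api-gw          # hypothetical name for illustration
  namespace: omega
spec:
  ingressClassName: ingress-controller-internal-nginx  # only the internal controller should process this
  rules:
    - host: example.internal    # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-svc   # hypothetical backend service
                port:
                  number: 80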

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):


NGINX Ingress controller
  Release:       v1.8.1
  Build:         dc88dce9ea5e700f3301d16f971fa17c6cfe757d
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6


I have also tried the latest available Helm chart, which didn't help:

NGINX Ingress controller
  Release:       v1.9.5
  Build:         f503c4bb5fa7d857ad29e94970eb550c2bc00b7c
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6


Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.4", GitCommit:"fa3d7990104d7c1f16943a67f11b154b71f6a132", GitTreeState:"clean", BuildDate:"2023-07-19T12:20:54Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27+", GitVersion:"v1.27.8-eks-8cb36c9", GitCommit:"fca3a8722c88c4dba573a903712a6feaf3c40a51", GitTreeState:"clean", BuildDate:"2023-11-22T21:52:13Z", GoVersion:"go1.20.11", Compiler:"gc", Platform:"linux/amd64"}

Environment:

NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"
        controller:
          resources:
            requests:
              cpu: 100m
              memory: 500Mi
            limits:
              cpu: 2
              memory: 2000Mi
          hostNetwork: true
          ingressClass: ingress-controller-internal-nginx

          ingressClassResource:
            controllerValue: "k8s.io/ingress-nginx-internal"
            name: ingress-controller-internal-nginx
          electionID: "ingress-controller-internal-leader"
          {{- if .Values.ingressControllerInternal.metrics.enabled }}
          metrics:
            enabled: true
            service:
              annotations:
                prometheus.io/port: "10254"
                prometheus.io/scrape: "true"
          {{- end }}
          service:
            targetPorts:
              http: http
              https: http
            annotations:
              nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
              service.beta.kubernetes.io/aws-load-balancer-type: nlb
              service.beta.kubernetes.io/aws-load-balancer-internal: "true"
              service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
              service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
              service.beta.kubernetes.io/aws-load-balancer-backend-protocol: TLS
              service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
              service.beta.kubernetes.io/aws-load-balancer-ssl-cert: {{ .Values.ingressControllerInternal.acm_arn }}
              service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: "Monitoring=enabled"
              # it doesn't work, aws-load-balancer-type must be changed to "external"
              # service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: "preserve_client_ip.enabled=false"
          podAnnotations:
            co.elastic.logs/processors.0.decode_json_fields.fields: message
            co.elastic.logs/processors.0.decode_json_fields.target: lb
          config:
            log-format-escape-json: true
            log-format-upstream: '{"@timestamp":"$msec", "date":"$time_iso8601", "upstreamIp":"$realip_remote_addr", "traceId": "$http_x_nexmo_trace_id",
              "clientIpAddress":"$remote_addr", "xForwardedFor":"$http_x_forwarded_for", "hdrContentType":"$http_content_type",
              "hdrSentContentType": "$sent_http_content_type", "remoteUser": "$remote_user", "uri": "$request_uri", 
              "method":"$request_method","serverProto":"$server_protocol", "httpStatus":"$status",
              "reqTime":"$request_time", "reqLength":"$request_length", "size":"$body_bytes_sent",
              "referer":"$http_referer", "userAgent":"$http_user_agent", "upsAddr":"$upstream_addr",
              "upsStatus":"$upstream_status",  "upsConnectTime":"$upstream_connect_time", "upsHeaderTime":"$upstream_header_time",  "upsResponseTime":"$upstream_response_time",
              "upsStatus_all":"$upstream_status",  "upsConnectTime_all":"$upstream_connect_time",
              "upsHeaderTime_all":"$upstream_header_time",  "upsResponseTime_all":"$upstream_response_time",
              "hostname":"$host",  "serverPort":"$server_port",  "scheme":"$scheme", "sslCipher":"$ssl_cipher",
              "sslProtocol":"$ssl_protocol"}'
            http-snippet: >-
              log_format bodyinfo escape=json '{"@timestamp":"$msec", "date":"$time_iso8601", "upstreamIp":"$realip_remote_addr", "traceId": "$http_x_nexmo_trace_id",
              "clientIpAddress":"$remote_addr", "xForwardedFor":"$http_x_forwarded_for", "hdrContentType":"$http_content_type",
              "hdrSentContentType": "$sent_http_content_type", "remoteUser": "$remote_user", "uri": "$request_uri", 
              "method":"$request_method","serverProto":"$server_protocol", "httpStatus":"$status",
              "reqTime":"$request_time", "reqLength":"$request_length", "size":"$body_bytes_sent",
              "referer":"$http_referer", "userAgent":"$http_user_agent", "upsAddr":"$upstream_addr",
              "upsStatus":"$upstream_status",  "upsConnectTime":"$upstream_connect_time", "upsHeaderTime":"$upstream_header_time",  "upsResponseTime":"$upstream_response_time",
              "upsStatus_all":"$upstream_status",  "upsConnectTime_all":"$upstream_connect_time",
              "upsHeaderTime_all":"$upstream_header_time",  "upsResponseTime_all":"$upstream_response_time",
              "hostname":"$host",  "serverPort":"$server_port",  "scheme":"$scheme", "sslCipher":"$ssl_cipher",
              "sslProtocol":"$ssl_protocol","requestBody":"[$request_body]"}';
          admissionWebhooks:
            timeoutSeconds: 30
          replicaCount: {{ .Values.ingressControllerInternal.replicaCount }}
          minAvailable: {{ max 1 ( sub .Values.ingressControllerInternal.replicaCount 1 ) }}
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                  - key: app.kubernetes.io/name
                    operator: In
                    values:
                    - ingress-nginx
                  - key: app.kubernetes.io/instance
                    operator: In
                    values:
                    - ingress-nginx-internal
                  - key: app.kubernetes.io/component
                    operator: In
                    values:
                    - controller
                topologyKey: "kubernetes.io/hostname"
          topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: ScheduleAnyway
              labelSelector:
                matchLabels:
                  app.kubernetes.io/instance: ingress-nginx-internal
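The values above only cover the internal controller. For comparison, a rough sketch of the class-related values for the second (public) controller, reconstructed from the pod arguments shown further down (--ingress-class, --controller-class, --election-id, --ingress-class-by-name) and assuming the chart's controller.ingressClassByName value, would be:

controller:
  ingressClass: nginx-public-nlb-tls
  ingressClassByName: true                  # renders --ingress-class-by-name=true
  ingressClassResource:
    name: nginx-public-nlb-tls
    controllerValue: "k8s.io/ingress-nginx-public-nlb-tls"
  electionID: "ingress-nginx-public-nlb-tls-leader"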
Name:         ingress-controller-internal-nginx
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=ingress-nginx-internal
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/part-of=ingress-nginx
              app.kubernetes.io/version=1.8.1
              argocd.argoproj.io/instance=ingress-nginx-internal-euw1-1
              helm.sh/chart=ingress-nginx-4.7.1
Annotations:  <none>
Controller:   k8s.io/ingress-nginx-internal
Events:       <none>

Name:         nginx-public-nlb-tls
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=ingress-nginx-public-nlb-tls
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/part-of=ingress-nginx
              app.kubernetes.io/version=1.9.5
              argocd.argoproj.io/instance=ingress-nginx-public-nlb-tls-euw1-1
              helm.sh/chart=ingress-nginx-4.9.0
Annotations:  <none>
Controller:   k8s.io/ingress-nginx-public-nlb-tls
Events:       <none>

Name:             ingress-nginx-public-nlb-tls-controller-6fbb668d64-prgkp
Namespace:        cluster
Priority:         0
Service Account:  ingress-nginx-public-nlb-tls
Node:             ip-10-229-145-39.eu-west-1.compute.internal/10.229.145.39
Start Time:       Tue, 23 Jan 2024 10:02:33 +0000
Labels:           app.kubernetes.io/component=controller
                  app.kubernetes.io/instance=ingress-nginx-public-nlb-tls
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=ingress-nginx
                  app.kubernetes.io/part-of=ingress-nginx
                  app.kubernetes.io/version=1.9.5
                  helm.sh/chart=ingress-nginx-4.9.0
                  pod-template-hash=6fbb668d64
Annotations:      co.elastic.logs/processors.0.decode_json_fields.fields: message
                  co.elastic.logs/processors.0.decode_json_fields.target: lb
                  kubectl.kubernetes.io/restartedAt: 2024-01-23T10:02:32Z
Status:           Running
IP:               10.229.145.39
IPs:
  IP:           10.229.145.39
Controlled By:  ReplicaSet/ingress-nginx-public-nlb-tls-controller-6fbb668d64
Containers:
  controller:
    Container ID:    containerd://3c0d0d081c8986c9bea84aa03e8c944848f35c415aca7d9d3e7dbc046eb3b346
    Image:           registry.k8s.io/ingress-nginx/controller:v1.9.5@sha256:b3aba22b1da80e7acfc52b115cae1d4c687172cbf2b742d5b502419c25ff340e
    Image ID:        registry.k8s.io/ingress-nginx/controller@sha256:b3aba22b1da80e7acfc52b115cae1d4c687172cbf2b742d5b502419c25ff340e
    Ports:           80/TCP, 443/TCP, 8443/TCP
    Host Ports:      80/TCP, 443/TCP, 8443/TCP
    SeccompProfile:  RuntimeDefault
    Args:
      /nginx-ingress-controller
      --publish-service=$(POD_NAMESPACE)/ingress-nginx-public-nlb-tls-controller
      --election-id=ingress-nginx-public-nlb-tls-leader
      --controller-class=k8s.io/ingress-nginx-public-nlb-tls
      --ingress-class=nginx-public-nlb-tls
      --configmap=$(POD_NAMESPACE)/ingress-nginx-public-nlb-tls-controller
      --validating-webhook=:8443
      --validating-webhook-certificate=/usr/local/certificates/cert
      --validating-webhook-key=/usr/local/certificates/key
      --ingress-class-by-name=true
    State:          Running
      Started:      Tue, 23 Jan 2024 10:02:34 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  2000Mi
    Requests:
      cpu:      100m
      memory:   500Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       ingress-nginx-public-nlb-tls-controller-6fbb668d64-prgkp (v1:metadata.name)
      POD_NAMESPACE:  cluster (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /usr/local/certificates/ from webhook-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-h9dd4 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  webhook-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ingress-nginx-public-nlb-tls-admission
    Optional:    false
  kube-api-access-h9dd4:
    Type:                     Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:   3607
    ConfigMapName:            kube-root-ca.crt
    ConfigMapOptional:        <nil>
    DownwardAPI:              true
QoS Class:                    Burstable
Node-Selectors:               kubernetes.io/os=linux
Tolerations:                  node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints:  topology.kubernetes.io/zone:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/instance=ingress-nginx-public-nlb-tls
Events:                       <none>

Name:                     ingress-nginx-public-nlb-tls-controller
Namespace:                cluster
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=ingress-nginx-public-nlb-tls
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
                          app.kubernetes.io/version=1.9.5
                          argocd.argoproj.io/instance=ingress-nginx-public-nlb-tls-euw1-1
                          helm.sh/chart=ingress-nginx-4.9.0
Annotations:              nginx.ingress.kubernetes.io/force-ssl-redirect: true
                          service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: Monitoring=enabled
                          service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: 60
                          service.beta.kubernetes.io/aws-load-balancer-type: nlb
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx-public-nlb-tls,app.kubernetes.io/name=ingress-nginx
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       172.20.60.210
IPs:                      172.20.60.210
LoadBalancer Ingress:     <redacted>
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  30881/TCP
Endpoints:                10.229.145.39:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  31447/TCP
Endpoints:                10.229.145.39:443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

Name:             neru-59e69cd7-go-neru-queue-scheduler-dev-com
Labels:           <none>
Namespace:        omega
Address:          <redacted>.elb.eu-west-1.amazonaws.com
Ingress Class:    nginx-public-nlb-tls
Default backend:  <default>
TLS:
  default-ingress-ssl terminates <redacted>
Rules:
  Host                                                                       Path  Backends
  ----                                                                       ----  --------
  <redacted>
                                                                             /   envoy:80 (172.16.90.147:5000)
Annotations:                                                                 nginx.ingress.kubernetes.io/backend-protocol: HTTP
                                                                             nginx.ingress.kubernetes.io/upstream-vhost: <redacted>
Events:                                                                      <none>

How to reproduce this issue:

Anything else we need to know:

k8s-ci-robot commented 8 months ago

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
longwuyuan commented 8 months ago

/remove-kind bug

longwuyuan commented 8 months ago

/triage needs-information

mdellavedova commented 8 months ago

Sorry, I posted by mistake before completing the form; please let me know if there's anything else I need to add.

mdellavedova commented 8 months ago

Hi, I can see the triage/needs-information label is still there after I updated the form last week. Could you please let me know if anything is missing?

longwuyuan commented 8 months ago
mdellavedova commented 7 months ago

Thanks for your reply

  • "Ignoring ingress" does not indicate that the ingress rules were used for routing

I'm sure the rules aren't used for routing, but I have a large number of ingresses that get pointlessly evaluated, causing an increase in load for one of the three pods in the deployment, which leads to restarts (every time there is a batch of "Ignoring ingress" errors in the logs, one of the pods restarts).

  • The most important aspect here is to confirm that you installed as per the link I pasted here earlier

I have followed that guide and double-checked the configuration multiple times.

  • The proof needed is that appropriate controller instance processes appropriate ingress rule routing

That's confirmed: the two ingress controllers only process their own ingress rules. The issue is the "Ignoring ingress" errors and the associated pod restarts.

longwuyuan commented 7 months ago
mdellavedova commented 7 months ago

Thanks for your effort. I believe the restarts are due to the number of ingress resources being evaluated; I have a similar setup in three separate regions:

region 1: 1962 ingresses managed by both controllers; controller 1: 33 restarts over 20 days (1 of 3 pods only); controller 2: 0 restarts over 19 days

region 2: 426 ingresses managed by both controllers; controller 1: 0 restarts over 20 days; controller 2: 123 restarts over 19 days (1 of 3 pods only)

region 3: 192 ingresses managed by both controllers; controller 1: 0 restarts over 20 days; controller 2: 0 restarts over 19 days (1 of 3 pods only)

Could you please re-run your test with a higher number of ingresses? I'm not sure why there is no correlation between the number of ingress resources and the number of restarts; I will try to compare the traffic in region 2 vs region 1.

longwuyuan commented 7 months ago
github-actions[bot] commented 6 months ago

This is stale, but we won't close it automatically; just bear in mind the maintainers may be busy with other tasks and will get to your issue as soon as possible. If you have any questions or want to request that this be prioritized, please reach out in #ingress-nginx-dev on Kubernetes Slack.