kubernetes / ingress-nginx

Ingress NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

Controller resolves default-backend pod IP as DNS name if Ingress leads to ExternalName service #12136

Closed: meatuses closed this issue 1 day ago

meatuses commented 3 days ago

What happened: If an ingress resource leads to service with type ExternalName, but also has annotation nginx.ingress.kubernetes.io/default-backend with values set to a service with type ClusterIP, ingress-nginx-controller tries to resolve pod IP of said ClusterIP service as a DNS name. I have attached manifests down in Others section.
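
For reference, here is a condensed sketch of the Ingress that triggers the behavior, assembled from the manifests in the Others section (names, hosts, and annotations are taken from that section):

```yaml
# Condensed Ingress sketch (full manifests are in the Others section below).
# The rule backend is a Service of type ExternalName, while the default-backend
# annotation names a Service of type ClusterIP backed by a single pod.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: external-name-ingress
  namespace: app
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    nginx.ingress.kubernetes.io/custom-http-errors: "500,501,502,503,504"
    nginx.ingress.kubernetes.io/default-backend: nginx-errors          # ClusterIP Service
    nginx.ingress.kubernetes.io/upstream-vhost: staging.storage.example.com
spec:
  ingressClassName: nginx
  rules:
  - host: static.test.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: static-test-storage                                  # ExternalName Service
            port:
              number: 443
```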

The ingress-nginx-controller logs are flooded with the following errors (10.111.0.218 is the IP of the pod behind the default-backend service):

2024/10/07 11:47:30 [error] 30#30: *14896 [lua] dns.lua:152: dns_lookup(): failed to query the DNS server for 10.111.0.218:
server returned error code: 3: name error
server returned error code: 3: name error, context: ngx.timer

It seems that the ClusterIP service somehow matches this condition: https://github.com/kubernetes/ingress-nginx/blob/controller-v1.11.2/rootfs/etc/nginx/lua/tcp_udp_balancer.lua#L74-L79

What you expected to happen:

The ingress-nginx controller does not try to resolve IP addresses as DNS names.

NGINX Ingress controller version v1.11.2

Kubernetes version: v1.27.16

Environment:

NAME                                         TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE   SELECTOR
service/ingress-nginx-controller             LoadBalancer   10.222.73.9      <pending>     80:30867/TCP,443:30264/TCP   20m   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
service/ingress-nginx-controller-admission   ClusterIP      10.222.252.199   <none>        443/TCP                      20m   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
service/nginx-ingress                        NodePort       10.222.20.200    <none>        80:31564/TCP,443:31630/TCP   12m   app.kubernetes.io/name=ingress-nginx

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                                                                                                                      SELECTOR
deployment.apps/ingress-nginx-controller   1/1     1            1           20m   controller   registry.k8s.io/ingress-nginx/controller:v1.11.2@sha256:d5f8217feeac4887cb1ed21f27c2674e58be06bd8f5184cacea2a69abaf78dce   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx

NAME                                                  DESIRED   CURRENT   READY   AGE   CONTAINERS   IMAGES                                                                                                                      SELECTOR
replicaset.apps/ingress-nginx-controller-6bbf7f5879   1         1         1       20m   controller   registry.k8s.io/ingress-nginx/controller:v1.11.2@sha256:d5f8217feeac4887cb1ed21f27c2674e58be06bd8f5184cacea2a69abaf78dce   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx,pod-template-hash=6bbf7f5879

  - `kubectl -n <ingresscontrollernamespace> describe po <ingresscontrollerpodname>`

kubectl -n ingress-nginx describe pod ingress-nginx-controller-6bbf7f5879-b8jg4

Name:                 ingress-nginx-controller-6bbf7f5879-b8jg4
Namespace:            ingress-nginx
Priority:             1000
Priority Class Name:  develop
Service Account:      ingress-nginx
Node:                 ob-ingress-nginx-test-0/10.128.0.31
Start Time:           Mon, 07 Oct 2024 11:35:07 +0000
Labels:               app.kubernetes.io/component=controller
                      app.kubernetes.io/instance=ingress-nginx
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=ingress-nginx
                      app.kubernetes.io/part-of=ingress-nginx
                      app.kubernetes.io/version=1.11.2
                      helm.sh/chart=ingress-nginx-4.11.2
                      pod-template-hash=6bbf7f5879
Annotations:
Status:               Running
IP:                   10.111.0.197
IPs:
  IP:  10.111.0.197
Controlled By:  ReplicaSet/ingress-nginx-controller-6bbf7f5879
Containers:
  controller:
    Container ID:    containerd://798fd3f16ce93d29367d06a497adb58483797996c0df5d0fcddfbce4e6c4e5c6
    Image:           registry.k8s.io/ingress-nginx/controller:v1.11.2@sha256:d5f8217feeac4887cb1ed21f27c2674e58be06bd8f5184cacea2a69abaf78dce
    Image ID:        registry.k8s.io/ingress-nginx/controller@sha256:d5f8217feeac4887cb1ed21f27c2674e58be06bd8f5184cacea2a69abaf78dce
    Ports:           80/TCP, 443/TCP, 8443/TCP
    Host Ports:      0/TCP, 0/TCP, 0/TCP
    SeccompProfile:  RuntimeDefault
    Args:
      /nginx-ingress-controller
      --publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
      --election-id=ingress-nginx-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
      --validating-webhook=:8443
      --validating-webhook-certificate=/usr/local/certificates/cert
      --validating-webhook-key=/usr/local/certificates/key
      --enable-metrics=false
    State:          Running
      Started:      Mon, 07 Oct 2024 11:35:09 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     100m
      memory:  90Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       ingress-nginx-controller-6bbf7f5879-b8jg4 (v1:metadata.name)
      POD_NAMESPACE:  ingress-nginx (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /usr/local/certificates/ from webhook-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-q49pf (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  webhook-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ingress-nginx-admission
    Optional:    false
  kube-api-access-q49pf:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From                      Message
  Normal  Scheduled  8m     default-scheduler         Successfully assigned ingress-nginx/ingress-nginx-controller-6bbf7f5879-b8jg4 to ob-ingress-nginx-test-0
  Normal  Pulled     7m59s  kubelet                   Container image "registry.k8s.io/ingress-nginx/controller:v1.11.2@sha256:d5f8217feeac4887cb1ed21f27c2674e58be06bd8f5184cacea2a69abaf78dce" already present on machine
  Normal  Created    7m59s  kubelet                   Created container controller
  Normal  Started    7m59s  kubelet                   Started container controller
  Normal  RELOAD     7m57s  nginx-ingress-controller  NGINX reload triggered due to a change in configuration

  - `kubectl -n <ingresscontrollernamespace> describe svc <ingresscontrollerservicename>`

kubectl -n ingress-nginx describe svc nginx-ingress

Name:                     nginx-ingress
Namespace:                ingress-nginx
Labels:
Annotations:
Selector:                 app.kubernetes.io/name=ingress-nginx
Type:                     NodePort
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.222.20.200
IPs:                      10.222.20.200
Port:                     http  80/TCP
TargetPort:               80/TCP
NodePort:                 http  31564/TCP
Endpoints:                10.111.0.197:80
Port:                     https  443/TCP
TargetPort:               443/TCP
NodePort:                 https  31630/TCP
Endpoints:                10.111.0.197:443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:


- **Current state of ingress object, if applicable**:
  - `kubectl -n <appnamespace> get all,ing -o wide`

kubectl -n app get all,ing -owide

NAME               READY   STATUS    RESTARTS   AGE   IP             NODE                      NOMINATED NODE   READINESS GATES
pod/nginx-errors   1/1     Running   0          50m   10.111.0.218   ob-ingress-nginx-test-0   <none>           <none>

NAME                          TYPE           CLUSTER-IP     EXTERNAL-IP                   PORT(S)   AGE   SELECTOR
service/nginx-errors          ClusterIP      10.222.4.219   <none>                        80/TCP    50m   app=errors
service/static-test-storage   ExternalName   <none>         staging.storage.example.com   <none>    50m   <none>

NAME                                              CLASS   HOSTS             ADDRESS   PORTS   AGE
ingress.networking.k8s.io/external-name-ingress   nginx   static.test.com             80      49m

  - `kubectl -n <appnamespace> describe ing <ingressname>`

kubectl -n app describe ingress external-name-ingress

Name:             external-name-ingress
Labels:
Namespace:        app
Address:
Ingress Class:    nginx
Default backend:
Rules:
  Host             Path  Backends
  static.test.com  /     static-test-storage:443 (<error: endpoints "static-test-storage" not found>)
Annotations:       nginx.ingress.kubernetes.io/backend-protocol: HTTPS
                   nginx.ingress.kubernetes.io/custom-http-errors: 500,501,502,503,504
                   nginx.ingress.kubernetes.io/default-backend: nginx-errors
                   nginx.ingress.kubernetes.io/upstream-vhost: staging.storage.example.com
Events:
  Type    Reason          Age  From                      Message
  Normal  AddedOrUpdated  50m  nginx-ingress-controller  Configuration for app/external-name-ingress was added or updated
  Normal  Sync            22m  nginx-ingress-controller  Scheduled for sync
  Normal  Sync            10m  nginx-ingress-controller  Scheduled for sync

  - If applicable, then your complete and exact curl/grpcurl command (redacted if required) and the response to the curl/grpcurl command with the -v flag

- **Others**:
  - Any other related information, like:
    - copy/paste of the snippet (if applicable)
    - `kubectl describe ...` of any custom configmap(s) created and in use
    - Any other related information that may help

ingress yaml:

kubectl -n app get ingress external-name-ingress -oyaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    nginx.ingress.kubernetes.io/custom-http-errors: 500,501,502,503,504
    nginx.ingress.kubernetes.io/default-backend: nginx-errors
    nginx.ingress.kubernetes.io/upstream-vhost: staging.storage.example.com
  creationTimestamp: "2024-10-07T10:55:38Z"
  generation: 1
  name: external-name-ingress
  namespace: app
  resourceVersion: "53425"
  uid: 172919a8-0407-4b98-a11b-542db1538814
spec:
  ingressClassName: nginx
  rules:
  - host: static.test.com
    http:
      paths:
      - backend:
          service:
            name: static-test-storage
            port:
              number: 443
        path: /
        pathType: Prefix

The ClusterIP service in front of the pod whose IP ingress-nginx tries to resolve as a DNS name:

# kubectl -n app get svc nginx-errors -oyaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2024-10-07T10:54:25Z"
  labels:
    service: nginx-errors
  name: nginx-errors
  namespace: app
  resourceVersion: "74017"
  uid: 373a9a24-4c53-4ae2-b83c-ffc5ea25a9c3
spec:
  clusterIP: 10.222.4.219
  clusterIPs:
  - 10.222.4.219
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: errors
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

the pod:

# kubectl -n app get pod -owide --show-labels
NAME           READY   STATUS    RESTARTS   AGE   IP             NODE                      NOMINATED NODE   READINESS GATES   LABELS
nginx-errors   1/1     Running   0          59m   10.111.0.218   ob-ingress-nginx-test-0   <none>           <none>            app=errors

How to reproduce this issue:

  1. Have a working Kubernetes cluster.
  2. Install ingress-nginx using the Helm quick start.
  3. Deploy the Ingress, the ExternalName Service, and the ClusterIP Service plus its pod using the manifests from the Others section above (a sketch of the supporting objects follows this list).
  4. Check the logs of ingress-nginx-controller and observe the dns_lookup(): failed to query the DNS server for 10.111.0.218 error.
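
To make step 3 easier to follow, here is a sketch of the supporting objects, assembled from the manifests in this report; combined with the Ingress sketch near the top it forms a complete reproducer. The pod spec is an assumption: the original pod manifest is not included in this issue, so any container serving HTTP on port 80 and labeled app=errors will do.

```yaml
# Supporting objects for step 3 (combine with the Ingress sketch near the top).
# The pod image is a stand-in assumption; the report only shows that the pod
# is named nginx-errors and carries the label app=errors.
apiVersion: v1
kind: Pod
metadata:
  name: nginx-errors
  namespace: app
  labels:
    app: errors
spec:
  containers:
  - name: nginx
    image: nginx          # stand-in image (assumption), serves HTTP on port 80
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-errors      # referenced by the default-backend annotation
  namespace: app
spec:
  type: ClusterIP
  selector:
    app: errors
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: static-test-storage   # the ExternalName backend of the Ingress rule
  namespace: app
spec:
  type: ExternalName
  externalName: staging.storage.example.com
```
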
k8s-ci-robot commented 3 days ago

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
longwuyuan commented 2 days ago

The information you have provided is not clear.

longwuyuan commented 2 days ago

/remove-kind bug
/kind support
/triage needs-information

meatuses commented 2 days ago

> Why do you have 2 services named ingress-nginx (one --type LoadBalancer and another --type NodePort)?

I don't have an LB in the testing environment; I added a NodePort service just to make sure the controller is up and running. The ingress-nginx services are not related to the problem. In fact, I've deleted every nginx service besides the admission one (in the ingress-nginx namespace) and the issue persists, since a working frontend is not required to reproduce it.

> I suggest you uninstall the controller and delete the namespace. Then delete that service of type NodePort. Ensure there is nothing related to the ingress-nginx controller on the cluster. Then do a default install of the controller using Helm.

OK, I've reinstalled ingress-nginx via the Helm quick start:

helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace

> Install Metallb.io. Configure metallb.io for L2 as per the metallb docs.

Not related, so I didn't do this. We have a production environment with load balancers where this issue also reproduces.

> Change the service of --type ExternalName to have the externalName field point to www.google.com. Remove all annotations except default-backend from the ingress.

Sure. But removing custom-http-errors causes the logging to stop; it seems the controller ignores default-backend without this annotation. Current manifests in the app namespace:

# kubectl -n app get ingress,svc -oyaml
apiVersion: v1
items:
- apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    annotations:
      nginx.ingress.kubernetes.io/custom-http-errors: "500"
      nginx.ingress.kubernetes.io/default-backend: nginx-errors
    creationTimestamp: "2024-10-07T10:55:38Z"
    generation: 1
    name: external-name-ingress
    namespace: app
    resourceVersion: "839585"
    uid: 172919a8-0407-4b98-a11b-542db1538814
  spec:
    ingressClassName: nginx
    rules:
    - host: static.test.com
      http:
        paths:
        - backend:
            service:
              name: static-test-storage
              port:
                number: 443
          path: /
          pathType: Prefix
  status:
    loadBalancer: {}
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: "2024-10-07T10:54:25Z"
    labels:
      service: nginx-errors
    name: nginx-errors
    namespace: app
    resourceVersion: "74017"
    uid: 373a9a24-4c53-4ae2-b83c-ffc5ea25a9c3
  spec:
    clusterIP: 10.222.4.219
    clusterIPs:
    - 10.222.4.219
    internalTrafficPolicy: Cluster
    ipFamilies:
    - IPv4
    ipFamilyPolicy: SingleStack
    ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: 80
    selector:
      app: errors
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: "2024-10-07T10:54:35Z"
    name: static-test-storage
    namespace: app
    resourceVersion: "833889"
    uid: 0d6383a1-06a0-435d-88d1-21d86cf28b9f
  spec:
    externalName: www.google.com
    sessionAffinity: None
    type: ExternalName
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""

The controller still tries to resolve the pod IP. Note that I'm not even sending any requests to this Ingress:

root@ob-ingress-nginx-test-0:~# kubectl -n ingress-nginx logs ingress-nginx-controller-6bbf7f5879-ddbwz  --tail=9
2024/10/08 10:58:41 [error] 304#304: *9605 [lua] dns.lua:152: dns_lookup(): failed to query the DNS server for 10.111.0.218:
server returned error code: 3: name error
server returned error code: 3: name error, context: ngx.timer
2024/10/08 10:58:41 [error] 302#302: *9609 [lua] dns.lua:152: dns_lookup(): failed to query the DNS server for 10.111.0.218:
server returned error code: 3: name error
server returned error code: 3: name error, context: ngx.timer
2024/10/08 10:58:41 [error] 303#303: *9613 [lua] dns.lua:152: dns_lookup(): failed to query the DNS server for 10.111.0.218:
server returned error code: 3: name error
server returned error code: 3: name error, context: ngx.timer
root@ob-ingress-nginx-test-0:~# kubectl -n app get pod -owide --show-labels
NAME           READY   STATUS    RESTARTS   AGE   IP             NODE                      NOMINATED NODE   READINESS GATES   LABELS
nginx-errors   1/1     Running   0          24h   10.111.0.218   ob-ingress-nginx-test-0   <none>           <none>            app=errors
longwuyuan commented 2 days ago

I can't understand your test

longwuyuan commented 1 day ago

The service created by the ingress-nginx controller is in a pending state

service/ingress-nginx-controller             LoadBalancer   10.222.73.9      <pending>     80:30867/TCP,443:30264/TCP   20m   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx

So the tests you conduct and the status you report are not valid.

If you need more support on this, then kindly show valid, test-related debug data for someone to analyze. I will close this issue for now, as we know that there are many users with Service objects of type ExternalName.

Once you have posted data from valid tests, you can re-open this issue.

/close

k8s-ci-robot commented 1 day ago

@longwuyuan: Closing this issue.

In response to [this](https://github.com/kubernetes/ingress-nginx/issues/12136#issuecomment-2401368356) (the comment quoted above).