Open meatuses opened 1 month ago
This issue is currently awaiting triage.
If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Hi,
Another user has raised an issue that has similarities with this one.
/kind feature /remove-kind bug
@longwuyuan There are two ways to define default-backend, the global backend and the annotation backend. They are not the same:
But your curl command hostname is understood by the controller, and that ExternalName service will never have an endpoint. So there is no design/code to handle this use-case of default-backend. See the screenshot below.
If the nginx.ingress.kubernetes.io/custom-http-errors annotation is specified (it is specified in this case), then the annotation default-backend will handle HTTP errors coming from the service.
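For reference, the two mechanisms look roughly like this (a sketch; the global-backend flag value and the Ingress name here are placeholders, only nginx-errors comes from this issue):

```yaml
# Global default backend: configured on the controller itself, e.g.
#   --default-backend-service=ingress-nginx/global-default-backend
# It catches requests that match no Ingress rule at all.

# Per-ingress default backend: an annotation on the Ingress, which
# (combined with custom-http-errors) intercepts errors for this Ingress only.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress          # placeholder name
  annotations:
    nginx.ingress.kubernetes.io/default-backend: nginx-errors
    nginx.ingress.kubernetes.io/custom-http-errors: "500"
```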
I already hinted at why this case is not working: https://github.com/kubernetes/ingress-nginx/issues/12158#issuecomment-2407367484. This PR fixes it: https://github.com/kubernetes/ingress-nginx/pull/12160.
First, it's required to state that default-backend is for requests that the controller does not understand. But your curl command hostname is understood by the controller, and that ExternalName service will never have an endpoint.
The issue here is:
In my curl request I was not trying to trigger an error that would lead me to the default-backend, because that is not needed to see the issue. These logs are generated at a rate of around 4 events per second, without any requests made to the ingress:
2024-10-14T14:27:21.097969668Z 2024/10/14 14:27:21 [error] 908#908: *170252 [lua] dns.lua:152: dns_lookup(): failed to query the DNS server for 10.111.0.170:
2024-10-14T14:27:21.098230504Z server returned error code: 3: name error
2024-10-14T14:27:21.098256945Z server returned error code: 3: name error, context: ngx.timer
2024-10-14T14:27:21.146672207Z 2024/10/14 14:27:21 [error] 909#909: *170257 [lua] dns.lua:152: dns_lookup(): failed to query the DNS server for 10.111.0.170:
2024-10-14T14:27:21.146714162Z server returned error code: 3: name error
2024-10-14T14:27:21.146719921Z server returned error code: 3: name error, context: ngx.timer
2024-10-14T14:27:21.519411144Z 2024/10/14 14:27:21 [error] 911#911: *170262 [lua] dns.lua:152: dns_lookup(): failed to query the DNS server for 10.111.0.170:
2024-10-14T14:27:21.519443371Z server returned error code: 3: name error
2024-10-14T14:27:21.519449218Z server returned error code: 3: name error, context: ngx.timer
2024-10-14T14:27:21.547436939Z 2024/10/14 14:27:21 [error] 910#910: *170267 [lua] dns.lua:152: dns_lookup(): failed to query the DNS server for 10.111.0.170:
2024-10-14T14:27:21.547480708Z server returned error code: 3: name error
2024-10-14T14:27:21.547486514Z server returned error code: 3: name error, context: ngx.timer
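As a quick sanity check (a sketch, not part of ingress-nginx), the reported event rate can be estimated by parsing the container-runtime timestamps on the dns_lookup failure lines:

```python
from datetime import datetime, timezone

DNS_ERR = "dns_lookup(): failed to query the DNS server"

def parse_rfc3339(ts: str) -> float:
    """Parse a container log timestamp like 2024-10-14T14:27:21.097969668Z
    (nanosecond precision, trailing Z) into epoch seconds."""
    base, _, frac = ts.rstrip("Z").partition(".")
    dt = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    return dt.timestamp() + (float("0." + frac) if frac else 0.0)

def dns_error_rate(lines) -> float:
    """Approximate dns_lookup failures per second across the given log lines."""
    stamps = sorted(parse_rfc3339(line.split()[0])
                    for line in lines if DNS_ERR in line)
    if len(stamps) < 2 or stamps[-1] == stamps[0]:
        return 0.0
    return (len(stamps) - 1) / (stamps[-1] - stamps[0])
```

Feeding it the four dns_lookup lines above yields a rate of roughly 6–7 events per second within that burst, consistent with a continuous flood.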
The bigger problem is the direction this is going. From the other related issue, it's evident that cross-namespace access can be attempted, so without discussing it in public, I want to state that cross-namespace access is not desired by the project in this shape & form.
This is my opinion, so please wait for other comments on this and your PR.
DNS servers that resolve my ExternalName
@longwuyuan I think I had emphasized enough that the issue is not with the ExternalName service itself; it resolves fine. The issue is: if you have a default-backend service with custom-http-errors set in your ingress, the controller tries to resolve the IP of a pod that was linked with the default-backend service as if it were a DNS name. I think it's clear from the log records I have provided.
Looking at the resources in your screenshots, I think that if you add the nginx.ingress.kubernetes.io/custom-http-errors: "500" annotation to your ingress (you can use any error codes you like), the errors will appear in your controller's logs, without any curl or other requests to your ingress and without trying to trigger these HTTP errors in the ExternalName backend.
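A minimal manifest that should reproduce the log flood, condensed from the resources shown later in this report (it assumes a ClusterIP service named nginx-errors already exists in the app namespace):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-name-svc-test
  namespace: app
spec:
  type: ExternalName
  externalName: www.google.com
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: external-name-ingress
  namespace: app
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    nginx.ingress.kubernetes.io/custom-http-errors: "500"
    # The combination below is what triggers the flood: the controller
    # treats the default-backend pod IP as a hostname to resolve.
    nginx.ingress.kubernetes.io/default-backend: nginx-errors
spec:
  ingressClassName: nginx
  rules:
    - host: static.test.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: external-name-svc-test
                port:
                  number: 443
```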
Also, disconnecting DNS servers or anything related to DNS settings is not related to the issue.
@meatuses thank you for your update. It helps. I will try now and update
@meatuses @chessman thanks for the tips
Now I can reproduce the problem: if the backend of an ingress is a Service of type ExternalName, and the ingress also has the default-backend annotation configured together with the custom-http-errors annotation, there is a flood/spam in the logs that looks like this:
2024/10/14 17:28:49 [error] 1644#1644: *157646 [lua] dns.lua:152: dns_lookup(): failed to query the DNS server for 10.244.0.26: server returned error code: 3: name error
server returned error code: 3: name error, context: ngx.timer
The config looks like the screenshot below:
To me this is a rare use-case that has never come to light before.
The analysis beyond this is a deep dive into the implementation of the default-backend annotation.
This is already hinting at a possible security problem, because @chessman was discussing creating a Service of type ExternalName for a service in another namespace.
I think we need to wait for comments from others, as I am not a developer.
What happened: If an ingress resource leads to a service with type ExternalName, but also has the annotation nginx.ingress.kubernetes.io/default-backend with the value set to a service with type ClusterIP, ingress-nginx-controller tries to resolve the pod IP of said ClusterIP service as a DNS name. I have attached manifests down in the Others section.
A lot of the following errors are generated in ingress-nginx-controller logs. 10.111.0.170 is the IP of a pod for the default-backend service:
Seems that the ClusterIP service somehow matched with this condition https://github.com/kubernetes/ingress-nginx/blob/controller-v1.11.3/rootfs/etc/nginx/lua/tcp_udp_balancer.lua#L74-L78
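The linked condition decides when an upstream address is sent through the Lua DNS resolver. The intended guard, sketched here in Python for illustration (the function name is hypothetical, not from the controller source), is to skip resolution for anything that already parses as an IP literal:

```python
import ipaddress

def needs_dns_resolution(address: str) -> bool:
    """Return True only for real hostnames; an address that already
    parses as an IPv4/IPv6 literal must be used as-is."""
    try:
        ipaddress.ip_address(address)
        return False  # already an IP literal, e.g. a pod IP like 10.111.0.170
    except ValueError:
        return True   # a hostname, e.g. an ExternalName target

# The bug described in this issue: a pod IP reaches the resolver path
# as if it were a hostname, producing a flood of NXDOMAIN errors.
```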
What you expected to happen: ingress-nginx-controller does not try to resolve IP addresses as DNS names.
NGINX Ingress controller version: v1.11.3
Kubernetes version: v1.27.16
Environment:
Cloud provider or hardware configuration: Bare metal 4 CPU, 8GiB, single node cluster
OS (e.g. from /etc/os-release): Ubuntu 22.04
Kernel (e.g. uname -a): 5.15.0-122-generic
Install tools: via Quick Start Helm to reproduce the issue
Basic cluster related info:
kubectl version
kubectl get nodes -o wide
How was the ingress-nginx-controller installed:
helm ls -A | grep -i ingress
helm -n <ingresscontrollernamespace> get values <helmreleasename>
Current State of the controller:
kubectl describe ingressclasses
kubectl -n <ingresscontrollernamespace> get all -A -o wide
NAME                                         TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE   SELECTOR
service/ingress-nginx-controller             LoadBalancer   10.222.34.99     10.128.0.40   80:30830/TCP,443:31937/TCP   30m   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
service/ingress-nginx-controller-admission   ClusterIP      10.222.155.122                 443/TCP                      30m   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                                                                                                                    SELECTOR
deployment.apps/ingress-nginx-controller   1/1     1            1           30m   controller   registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx

NAME                                                  DESIRED   CURRENT   READY   AGE   CONTAINERS   IMAGES                                                                                                                    SELECTOR
replicaset.apps/ingress-nginx-controller-5979bb57db   1         1         1       30m   controller   registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx,pod-template-hash=5979bb57db
kubectl -n ingress-nginx describe pod ingress-nginx-controller-5979bb57db-s7wzm
Name:                 ingress-nginx-controller-5979bb57db-s7wzm
Namespace:            ingress-nginx
Priority:             1000
Priority Class Name:  develop
Service Account:      ingress-nginx
Node:                 ob-ingress-nginx-test-0/10.128.0.31
Start Time:           Mon, 14 Oct 2024 11:33:58 +0000
Labels:               app.kubernetes.io/component=controller
                      app.kubernetes.io/instance=ingress-nginx
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=ingress-nginx
                      app.kubernetes.io/part-of=ingress-nginx
                      app.kubernetes.io/version=1.11.3
                      helm.sh/chart=ingress-nginx-4.11.3
                      pod-template-hash=5979bb57db
Annotations:
Status: Running
IP: 10.111.0.188
IPs:
IP: 10.111.0.188
Controlled By: ReplicaSet/ingress-nginx-controller-5979bb57db
Containers:
controller:
Container ID: containerd://93e8c789a844dfb2257501727b92726d5f49ff1de7bb48b3d20a0ea3ea09992a
Image: registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7
Image ID: registry.k8s.io/ingress-nginx/controller@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7
Ports: 80/TCP, 443/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
SeccompProfile: RuntimeDefault
Args:
/nginx-ingress-controller
--publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
--election-id=ingress-nginx-leader
--controller-class=k8s.io/ingress-nginx
--ingress-class=nginx
--configmap=$(POD_NAMESPACE)/ingress-nginx-controller
--validating-webhook=:8443
--validating-webhook-certificate=/usr/local/certificates/cert
--validating-webhook-key=/usr/local/certificates/key
--enable-metrics=false
State: Running
Started: Mon, 14 Oct 2024 11:34:01 +0000
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 90Mi
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-nginx-controller-5979bb57db-s7wzm (v1:metadata.name)
POD_NAMESPACE: ingress-nginx (v1:metadata.namespace)
LD_PRELOAD: /usr/local/lib/libmimalloc.so
Mounts:
/usr/local/certificates/ from webhook-cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ftlkh (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
webhook-cert:
Type: Secret (a volume populated by a Secret)
SecretName: ingress-nginx-admission
Optional: false
kube-api-access-ftlkh:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age                  From                      Message
  Normal  Scheduled  23m                  default-scheduler         Successfully assigned ingress-nginx/ingress-nginx-controller-5979bb57db-s7wzm to ob-ingress-nginx-test-0
  Normal  Pulled     23m                  kubelet                   Container image "registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7" already present on machine
  Normal  Created    23m                  kubelet                   Created container controller
  Normal  Started    23m                  kubelet                   Started container controller
  Normal  RELOAD     4m32s (x7 over 23m)  nginx-ingress-controller  NGINX reload triggered due to a change in configuration
kubectl -n ingress-nginx describe svc ingress-nginx-controller
Name:                     ingress-nginx-controller
Namespace:                ingress-nginx
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=ingress-nginx
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
                          app.kubernetes.io/version=1.11.3
                          helm.sh/chart=ingress-nginx-4.11.3
Annotations:              meta.helm.sh/release-name: ingress-nginx
                          meta.helm.sh/release-namespace: ingress-nginx
                          metallb.universe.tf/ip-allocated-from-pool: frontend-pool
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.222.34.99
IPs:                      10.222.34.99
LoadBalancer Ingress:     10.128.0.40
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  30830/TCP
Endpoints:                10.111.0.188:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  31937/TCP
Endpoints:                10.111.0.188:443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type    Reason        Age                From                Message
  Normal  IPAllocated   33m                metallb-controller  Assigned IP ["10.128.0.40"]
  Normal  nodeAssigned  26m (x2 over 32m)  metallb-speaker     announcing from node "ob-ingress-nginx-test-0" with protocol "layer2"
kubectl -n app get all,ing -owide
NAME               READY   STATUS    RESTARTS        AGE    IP             NODE                      NOMINATED NODE   READINESS GATES
pod/nginx-errors   1/1     Running   1 (2d21h ago)   7d1h   10.111.0.170   ob-ingress-nginx-test-0

NAME                             TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)   AGE    SELECTOR
service/external-name-svc-test   ExternalName                  www.google.com             22m
service/nginx-errors             ClusterIP      10.222.4.219                    80/TCP    7d1h   app=errors

NAME                                              CLASS   HOSTS             ADDRESS       PORTS   AGE
ingress.networking.k8s.io/external-name-ingress   nginx   static.test.com   10.128.0.40   80      7d1h
kubectl -n app describe ingress external-name-ingress
Name:             external-name-ingress
Labels:
Namespace: app
Address: 10.128.0.40
Ingress Class: nginx
Default backend:
Rules:
  Host             Path  Backends
  static.test.com  /     external-name-svc-test:443 (<error: endpoints "external-name-svc-test" not found>)
Annotations:  nginx.ingress.kubernetes.io/backend-protocol: HTTPS
              nginx.ingress.kubernetes.io/custom-http-errors: 500
              nginx.ingress.kubernetes.io/default-backend: nginx-errors
              nginx.ingress.kubernetes.io/preserve-host: false
Events:
  Type    Reason  Age                  From                      Message
  Normal  Sync    30m (x3 over 33m)    nginx-ingress-controller  Scheduled for sync
  Normal  Sync    7m55s (x6 over 27m)  nginx-ingress-controller  Scheduled for sync
kubectl -n app get ingress external-name-ingress -oyaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    nginx.ingress.kubernetes.io/custom-http-errors: "500"
    nginx.ingress.kubernetes.io/default-backend: nginx-errors
    nginx.ingress.kubernetes.io/preserve-host: "false"
  creationTimestamp: "2024-10-07T10:55:38Z"
  generation: 2
  name: external-name-ingress
  namespace: app
  resourceVersion: "3361207"
  uid: 172919a8-0407-4b98-a11b-542db1538814
spec:
  ingressClassName: nginx
  rules:
kubectl -n app get svc external-name-svc-test -oyaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2024-10-14T11:38:18Z"
  name: external-name-svc-test
  namespace: app
  resourceVersion: "3352798"
  uid: e5c7e91b-b105-4a33-8d50-35e2d2d9b1c0
spec:
  externalName: www.google.com
  sessionAffinity: None
  type: ExternalName
status:
  loadBalancer: {}
ClusterIP of the default-backend service, connected to the pod which ingress-nginx tries to resolve as DNS:
the pod:
curl to the ingress works (not sure why Google returns 404, though):
How to reproduce this issue: apply the attached manifests and watch the controller logs for the dns_lookup(): failed to query the DNS server for 10.111.0.170 error.