scott-kausler opened this issue 1 year ago
This issue is currently awaiting triage.
If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
/remove-kind bug
Hi, this has been reported twice, and it is related to the change where EndpointSlices are now used.
This issue, in its current state, does not contain enough data to point at an action item. It would help a lot if you could write step-by-step instructions that can be copy/pasted to reproduce the problem on a minikube cluster or a kind cluster.
It is also possible that there is a so-far-unknown reason why the EndpointSlice does not get populated. Even for that, it becomes more important to know a way to reproduce and debug the problem (because simply creating a workload with an image like nginx:alpine does not trigger it). Thanks
@scott-kausler please provide kubectl -n $ns get svc,ing,ep,endpointslice
and kubectl -n $ns get svc,ing,ep,endpointslice -o yaml
Hi, I am having the same issue as reported in this ticket. I initially created a ticket against Rancher (issue 41584), as I wasn't sure whether it was a Rancher issue or specific to kubernetes/ingress-nginx. Is it possible to provide some insight into why this could be happening?
Every 25 to 45 minutes one service is available, but during the next interval the Rancher GUI becomes unavailable with "404 page not found" and the error "Service "rancher" does not have an active endpoint".
Hi @tombokombo I ran the commands as recommended; please refer to the output below:
apiVersion: v1
items:
- apiVersion: v1
kind: Service
metadata:
annotations:
meta.helm.sh/release-name: rke2-coredns
meta.helm.sh/release-namespace: kube-system
creationTimestamp: "2023-05-02T17:29:46Z"
labels:
app.kubernetes.io/instance: rke2-coredns
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: rke2-coredns
helm.sh/chart: rke2-coredns-1.19.402
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
kubernetes.io/name: CoreDNS
name: rke2-coredns-rke2-coredns
namespace: kube-system
resourceVersion: "668"
uid: REDACTED
spec:
clusterIP: REDACTED
clusterIPs:
- REDACTED
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: udp-53
port: 53
protocol: UDP
targetPort: 53
- name: tcp-53
port: 53
protocol: TCP
targetPort: 53
selector:
app.kubernetes.io/instance: rke2-coredns
app.kubernetes.io/name: rke2-coredns
k8s-app: kube-dns
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: v1
kind: Service
metadata:
annotations:
meta.helm.sh/release-name: rke2-ingress-nginx
meta.helm.sh/release-namespace: kube-system
creationTimestamp: "2023-05-02T17:30:20Z"
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: rke2-ingress-nginx
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: rke2-ingress-nginx
app.kubernetes.io/part-of: rke2-ingress-nginx
app.kubernetes.io/version: 1.6.4
helm.sh/chart: rke2-ingress-nginx-4.5.201
name: rke2-ingress-nginx-controller-admission
namespace: kube-system
resourceVersion: "1183"
uid: REDACTED
spec:
clusterIP: REDACTED
clusterIPs:
- REDACTED
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- appProtocol: https
name: https-webhook
port: 443
protocol: TCP
targetPort: webhook
selector:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: rke2-ingress-nginx
app.kubernetes.io/name: rke2-ingress-nginx
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: v1
kind: Service
metadata:
annotations:
meta.helm.sh/release-name: rke2-metrics-server
meta.helm.sh/release-namespace: kube-system
creationTimestamp: "2023-05-02T17:30:09Z"
labels:
app: rke2-metrics-server
app.kubernetes.io/managed-by: Helm
chart: rke2-metrics-server-2.11.100-build2022101107
heritage: Helm
release: rke2-metrics-server
name: rke2-metrics-server
namespace: kube-system
resourceVersion: "5197581"
uid: REDACTED
spec:
clusterIP: REDACTED
clusterIPs:
- REDACTED
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: https
port: 443
protocol: TCP
targetPort: https
- name: metrics
port: 10250
protocol: TCP
targetPort: 10250
selector:
app: rke2-metrics-server
release: rke2-metrics-server
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: v1
kind: Service
metadata:
annotations:
meta.helm.sh/release-name: rke2-snapshot-validation-webhook
meta.helm.sh/release-namespace: kube-system
creationTimestamp: "2023-05-02T17:30:10Z"
labels:
app.kubernetes.io/instance: rke2-snapshot-validation-webhook
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: rke2-snapshot-validation-webhook
app.kubernetes.io/version: v6.2.1
helm.sh/chart: rke2-snapshot-validation-webhook-1.7.100
name: rke2-snapshot-validation-webhook
namespace: kube-system
resourceVersion: "980"
uid: REDACTED
spec:
clusterIP: REDACTED
clusterIPs:
- REDACTED
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: https
port: 443
protocol: TCP
targetPort: https
selector:
app.kubernetes.io/instance: rke2-snapshot-validation-webhook
app.kubernetes.io/name: rke2-snapshot-validation-webhook
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: v1
kind: Endpoints
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-19T10:05:33Z"
creationTimestamp: "2023-05-02T17:29:46Z"
labels:
app.kubernetes.io/instance: rke2-coredns
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: rke2-coredns
helm.sh/chart: rke2-coredns-1.19.402
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
kubernetes.io/name: CoreDNS
name: rke2-coredns-rke2-coredns
namespace: kube-system
resourceVersion: "5534372"
uid: REDACTED
subsets:
- addresses:
- ip: REDACTED
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-coredns-rke2-coredns-6b9548f79f-fg2th
namespace: kube-system
uid: REDACTED
- ip: REDACTED
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-coredns-rke2-coredns-6b9548f79f-n4p5l
namespace: kube-system
uid: REDACTED
ports:
- name: tcp-53
port: 53
protocol: TCP
- name: udp-53
port: 53
protocol: UDP
- apiVersion: v1
kind: Endpoints
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-19T10:05:23Z"
creationTimestamp: "2023-05-02T17:30:20Z"
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: rke2-ingress-nginx
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: rke2-ingress-nginx
app.kubernetes.io/part-of: rke2-ingress-nginx
app.kubernetes.io/version: 1.6.4
helm.sh/chart: rke2-ingress-nginx-4.5.201
name: rke2-ingress-nginx-controller-admission
namespace: kube-system
resourceVersion: "5534140"
uid: REDACTED
subsets:
- addresses:
- ip: REDACTED
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-ingress-nginx-controller-2h95m
namespace: kube-system
uid: REDACTED
- ip: REDACTED
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-ingress-nginx-controller-8hvtl
namespace: kube-system
uid: REDACTED
- ip: REDACTED
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-ingress-nginx-controller-c8x24
namespace: kube-system
uid: REDACTED
- ip: REDACTED
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-ingress-nginx-controller-df4lk
namespace: kube-system
uid: REDACTED
ports:
- appProtocol: https
name: https-webhook
port: 8443
protocol: TCP
- apiVersion: v1
kind: Endpoints
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-02T17:30:09Z"
creationTimestamp: "2023-05-02T17:30:09Z"
labels:
app: rke2-metrics-server
app.kubernetes.io/managed-by: Helm
chart: rke2-metrics-server-2.11.100-build2022101107
heritage: Helm
release: rke2-metrics-server
name: rke2-metrics-server
namespace: kube-system
resourceVersion: "5533133"
uid: REDACTED
subsets:
- addresses:
- ip: REDACTED
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-metrics-server-7d58bbc9c6-xvgg8
namespace: kube-system
uid: REDACTED
ports:
- name: metrics
port: 10250
protocol: TCP
- name: https
port: 10250
protocol: TCP
- apiVersion: v1
kind: Endpoints
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-02T17:30:10Z"
creationTimestamp: "2023-05-02T17:30:10Z"
labels:
app.kubernetes.io/instance: rke2-snapshot-validation-webhook
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: rke2-snapshot-validation-webhook
app.kubernetes.io/version: v6.2.1
helm.sh/chart: rke2-snapshot-validation-webhook-1.7.100
name: rke2-snapshot-validation-webhook
namespace: kube-system
resourceVersion: "5533131"
uid: REDACTED
subsets:
- addresses:
- ip: REDACTED
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-snapshot-validation-webhook-7748dbf6ff-xdtm2
namespace: kube-system
uid: REDACTED
ports:
- name: https
port: 8443
protocol: TCP
- addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
- addresses:
- REDACTED
conditions:
ready: true
serving: true
terminating: false
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-coredns-rke2-coredns-6b9548f79f-fg2th
namespace: kube-system
uid: REDACTED
- addresses:
- REDACTED
conditions:
ready: true
serving: true
terminating: false
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-coredns-rke2-coredns-6b9548f79f-n4p5l
namespace: kube-system
uid: REDACTED
kind: EndpointSlice
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-19T10:05:33Z"
creationTimestamp: "2023-05-02T17:29:46Z"
generateName: rke2-coredns-rke2-coredns-
generation: 78
labels:
app.kubernetes.io/instance: rke2-coredns
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: rke2-coredns
endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
helm.sh/chart: rke2-coredns-1.19.402
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
kubernetes.io/name: CoreDNS
kubernetes.io/service-name: rke2-coredns-rke2-coredns
name: rke2-coredns-rke2-coredns-d7srf
namespace: kube-system
ownerReferences:
- apiVersion: v1
blockOwnerDeletion: true
controller: true
kind: Service
name: rke2-coredns-rke2-coredns
uid: REDACTED
resourceVersion: "5534370"
uid: REDACTED
ports:
- name: tcp-53
port: 53
protocol: TCP
- name: udp-53
port: 53
protocol: UDP
- addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
- addresses:
- REDACTED
conditions:
ready: true
serving: true
terminating: false
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-ingress-nginx-controller-2h95m
namespace: kube-system
uid: REDACTED
- addresses:
- REDACTED
conditions:
ready: true
serving: true
terminating: false
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-ingress-nginx-controller-c8x24
namespace: kube-system
uid: REDACTED
- addresses:
- REDACTED
conditions:
ready: true
serving: true
terminating: false
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-ingress-nginx-controller-df4lk
namespace: kube-system
uid: REDACTED
- addresses:
- REDACTED
conditions:
ready: true
serving: true
terminating: false
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-ingress-nginx-controller-8hvtl
namespace: kube-system
uid: REDACTED
kind: EndpointSlice
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-19T10:05:23Z"
creationTimestamp: "2023-05-02T17:30:20Z"
generateName: rke2-ingress-nginx-controller-admission-
generation: 265
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: rke2-ingress-nginx
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: rke2-ingress-nginx
app.kubernetes.io/part-of: rke2-ingress-nginx
app.kubernetes.io/version: 1.6.4
endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
helm.sh/chart: rke2-ingress-nginx-4.5.201
kubernetes.io/service-name: rke2-ingress-nginx-controller-admission
name: rke2-ingress-nginx-controller-admission-g25cm
namespace: kube-system
ownerReferences:
- apiVersion: v1
blockOwnerDeletion: true
controller: true
kind: Service
name: rke2-ingress-nginx-controller-admission
uid: REDACTED
resourceVersion: "5534139"
uid: REDACTED
ports:
- appProtocol: https
name: https-webhook
port: 8443
protocol: TCP
- addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
- addresses:
- REDACTED
conditions:
ready: true
serving: true
terminating: false
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-metrics-server-7d58bbc9c6-xvgg8
namespace: kube-system
uid: REDACTED
kind: EndpointSlice
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-02T17:30:09Z"
creationTimestamp: "2023-05-02T17:30:09Z"
generateName: rke2-metrics-server-
generation: 27
labels:
app: rke2-metrics-server
app.kubernetes.io/managed-by: Helm
chart: rke2-metrics-server-2.11.100-build2022101107
endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
heritage: Helm
kubernetes.io/service-name: rke2-metrics-server
release: rke2-metrics-server
name: rke2-metrics-server-wmz2b
namespace: kube-system
ownerReferences:
- apiVersion: v1
blockOwnerDeletion: true
controller: true
kind: Service
name: rke2-metrics-server
uid: REDACTED
resourceVersion: "5533128"
uid: REDACTED
ports:
- name: metrics
port: 10250
protocol: TCP
- name: https
port: 10250
protocol: TCP
- addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
- addresses:
- REDACTED
conditions:
ready: true
serving: true
terminating: false
nodeName: REDACTED
targetRef:
kind: Pod
name: rke2-snapshot-validation-webhook-7748dbf6ff-xdtm2
namespace: kube-system
uid: REDACTED
kind: EndpointSlice
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-02T17:30:10Z"
creationTimestamp: "2023-05-02T17:30:10Z"
generateName: rke2-snapshot-validation-webhook-
generation: 16
labels:
app.kubernetes.io/instance: rke2-snapshot-validation-webhook
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: rke2-snapshot-validation-webhook
app.kubernetes.io/version: v6.2.1
endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
helm.sh/chart: rke2-snapshot-validation-webhook-1.7.100
kubernetes.io/service-name: rke2-snapshot-validation-webhook
name: rke2-snapshot-validation-webhook-mzc9v
namespace: kube-system
ownerReferences:
- apiVersion: v1
blockOwnerDeletion: true
controller: true
kind: Service
name: rke2-snapshot-validation-webhook
uid: REDACTED
resourceVersion: "5533125"
uid: REDACTED
ports:
- name: https
port: 8443
protocol: TCP
kind: List
metadata:
resourceVersion: ""
Hi, I am having the same problem reported in this issue, and I noticed it only happens when the service name is too long. It was introduced in this change, https://github.com/kubernetes/ingress-nginx/pull/8890, when migrating to EndpointSlices.
This error didn't happen with Endpoints because an Endpoints object always has the same name as its Service. EndpointSlice names, however, are truncated when the Service name is too long, and the controller tries to fetch the EndpointSlices using the Service name, which no longer matches.
Example:
# kubectl get endpoints -n my-awesome-service | grep sensorgroup
my-awesome-service-telemetry-online-processor-dlc-sensorgroup 10.0.0.21:8080
# kubectl get EndpointSlice -n my-awesome-service | grep sensorgr
my-awesome-service-telemetry-online-processor-dlc-sensorgrn4mvj IPv4 8080 10.0.0.21 35d
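That truncation can be sketched as follows. This is a minimal illustration, not the apiserver's actual code; the constants reflect my reading of how metadata.generateName behaves (base truncated so that the result plus a 5-character random suffix fits in 63 characters), so treat them as assumptions:

```python
import random
import string

# Assumed generateName behavior: names are capped at 63 chars, a 5-char
# random suffix is appended, so the base is truncated to 58 chars.
MAX_NAME_LEN = 63
SUFFIX_LEN = 5
MAX_BASE_LEN = MAX_NAME_LEN - SUFFIX_LEN  # 58

def generated_slice_name(service_name: str) -> str:
    """Illustrative sketch of how an EndpointSlice name is generated
    from its Service name via generateName."""
    base = (service_name + "-")[:MAX_BASE_LEN]
    suffix = "".join(random.choices(string.ascii_lowercase + string.digits,
                                    k=SUFFIX_LEN))
    return base + suffix

svc = "my-awesome-service-telemetry-online-processor-dlc-sensorgroup"
# For this 61-char service name, the base is cut to 58 chars ("...sensorgr"),
# so the generated slice name no longer begins with the full service name.
print(generated_slice_name(svc))
```

This matches the example above: the slice name ends in "sensorgr" plus a random suffix instead of "sensorgroup", so a lookup keyed on the full Service name fails.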
I think this issue is related and could be the fix https://github.com/kubernetes/ingress-nginx/issues/9908
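For context, the robust way for a controller to find a Service's EndpointSlices is the kubernetes.io/service-name label, which is exact even when the generated object name is truncated. A sketch contrasting the two lookups, using made-up data rather than real cluster output or ingress-nginx's actual code:

```python
# Label the EndpointSlice controller sets on every slice it manages.
SERVICE_NAME_LABEL = "kubernetes.io/service-name"

# Hypothetical slices mirroring the examples discussed in this thread.
slices = [
    {"name": "rancher-hkpgr",
     "labels": {SERVICE_NAME_LABEL: "rancher"}},
    {"name": "my-awesome-service-telemetry-online-processor-dlc-sensorgrn4mvj",
     "labels": {SERVICE_NAME_LABEL:
                "my-awesome-service-telemetry-online-processor-dlc-sensorgroup"}},
]

def slices_for_service(slices, service):
    """Exact label match: works regardless of name truncation."""
    return [s for s in slices if s["labels"].get(SERVICE_NAME_LABEL) == service]

def slices_by_name_prefix(slices, service):
    """Fragile name-based lookup: misses slices whose base name was truncated."""
    return [s for s in slices if s["name"].startswith(service + "-")]

long_svc = "my-awesome-service-telemetry-online-processor-dlc-sensorgroup"
assert len(slices_for_service(slices, long_svc)) == 1   # found via label
assert slices_by_name_prefix(slices, long_svc) == []    # missed via name
```

The equivalent kubectl check would be selecting with -l kubernetes.io/service-name=&lt;service&gt; instead of grepping names.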
Does this indicate that a fix has already been implemented? Also, if it relates to long service names, why would this be happening to the "rancher" service, which does not seem to be a long name?
If it is really about long names, then it looks like the fix for long service names shipped in this release: https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.5.1
Thanks @longwuyuan
Thank you for the information, but if it was fixed then why are these issues still occurring? Do you have any idea why this is the case? Any feedback would be much appreciated!
@longwuyuan is this issue due to long service names? Is that why the services are being reported to not have an active endpoint?
Please see below the error logged for the rancher service, along with the EndpointSlice name (service name plus generated suffix). Even accounting for the suffix added to the service name, the rancher EndpointSlice name is well under the 63-character limit.
Service "cattle-system/rancher" does not have any active Endpoint.
EndpointSlice name (service name + suffix):
rancher-hkpgr
Are all services ignored because of the suffix being added to the EndpointSlice name? Or are services only ignored when the resulting EndpointSlice name exceeds 63 characters?
Does anyone have any thoughts on this?
I forgot to mention that the services' Endpoints/EndpointSlices are periodically recognized and function as expected. However, a service will then randomly throw a 404 error along with "service does not have an active endpoint", even though an active endpoint exists.
Hi,
The data posted in this issue does not look like something a developer can use to reproduce the problem. Any help on reproducing the problem is welcome.
Also welcome is data that completely covers the bad state while the problem is actively occurring: logs combined with the output of kubectl describe ... for all related objects and components (controller, application, ingress, pod, svc, ep, endpointslices, etc.).
@longwuyuan please see the requested output below. The only log line found for this issue is "Service cattle-system/rancher does not have any active Endpoint".
# kubectl -n cattle-system get svc,ing,ep,endpointslice
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/rancher ClusterIP REDACTED <none> 80/TCP,443/TCP 25h
service/rancher-webhook ClusterIP REDACTED <none> 443/TCP 24h
service/webhook-service ClusterIP REDACTED <none> 443/TCP 24h
NAME CLASS HOSTS ADDRESS PORTS AGE
ingress.networking.k8s.io/rancher <none> REDACTED REDACTED 80, 443 25h
NAME ENDPOINTS AGE
endpoints/rancher HOST1:80,HOST2:80,HOST3:80 + 3 more... 25h
endpoints/rancher-webhook HOST1:9443 24h
endpoints/webhook-service HOST1:8777 24h
NAME ADDRESSTYPE PORTS ENDPOINTS AGE
endpointslice.discovery.k8s.io/rancher-hkpgr IPv4 80,444 HOST2,HOST3,HOST1 25h
endpointslice.discovery.k8s.io/rancher-webhook-sfgns IPv4 9443 HOST1 24h
endpointslice.discovery.k8s.io/webhook-service-b4s92 IPv4 8777 HOST1 24h
---
# kubectl -n cattle-system get svc,ing,ep,endpointslice -o yaml
apiVersion: v1
items:
- apiVersion: v1
kind: Service
metadata:
annotations:
meta.helm.sh/release-name: rancher
meta.helm.sh/release-namespace: cattle-system
creationTimestamp: "2023-05-22T15:11:46Z"
labels:
app: rancher
app.kubernetes.io/managed-by: Helm
chart: rancher-2.7.3
heritage: Helm
release: rancher
name: rancher
namespace: cattle-system
resourceVersion: "5250"
uid: REDACTED
spec:
clusterIP: REDACTED
clusterIPs:
- REDACTED
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: http
port: 80
protocol: TCP
targetPort: 80
- name: https-internal
port: 443
protocol: TCP
targetPort: 444
selector:
app: rancher
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: v1
kind: Service
metadata:
annotations:
meta.helm.sh/release-name: rancher-webhook
meta.helm.sh/release-namespace: cattle-system
creationTimestamp: "2023-05-22T15:17:43Z"
labels:
app.kubernetes.io/managed-by: Helm
name: rancher-webhook
namespace: cattle-system
resourceVersion: "9776"
uid: REDACTED
spec:
clusterIP: REDACTED
clusterIPs:
- REDACTED
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: https
port: 443
protocol: TCP
targetPort: 9443
selector:
app: rancher-webhook
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: v1
kind: Service
metadata:
annotations:
meta.helm.sh/release-name: rancher-webhook
meta.helm.sh/release-namespace: cattle-system
need-a-cert.cattle.io/secret-name: rancher-webhook-tls
creationTimestamp: "2023-05-22T15:17:43Z"
labels:
app.kubernetes.io/managed-by: Helm
name: webhook-service
namespace: cattle-system
resourceVersion: "9772"
uid: REDACTED
spec:
clusterIP: REDACTED
clusterIPs:
- REDACTED
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: https
port: 443
protocol: TCP
targetPort: 8777
selector:
app: rancher-webhook
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
annotations:
field.cattle.io/publicEndpoints: '[{"addresses":["WORKER1","CONTROLPLANE","WORKER3","WORKER2"],"port":443,"protocol":"HTTPS","serviceName":"cattle-system:rancher","ingressName":"cattle-system:rancher","hostname":"CONTROLPLANE-HOSTNAME","allNodes":false}]'
meta.helm.sh/release-name: rancher
meta.helm.sh/release-namespace: cattle-system
nginx.ingress.kubernetes.io/proxy-connect-timeout: "30"
nginx.ingress.kubernetes.io/proxy-read-timeout: "1800"
nginx.ingress.kubernetes.io/proxy-send-timeout: "1800"
creationTimestamp: "2023-05-22T15:11:46Z"
generation: 1
labels:
app: rancher
app.kubernetes.io/managed-by: Helm
chart: rancher-2.7.3
heritage: Helm
release: rancher
name: rancher
namespace: cattle-system
resourceVersion: "301991"
uid: REDACTED
spec:
rules:
- host: CONTROLPLANE-HOSTNAME
http:
paths:
- backend:
service:
name: rancher
port:
number: 80
pathType: ImplementationSpecific
tls:
- hosts:
- CONTROLPLANE-HOSTNAME
secretName: tls-rancher-ingress
status:
loadBalancer:
ingress:
- ip: WORKER1
- ip: CONTROLPLANE
- ip: WORKER3
- ip: WORKER2
- apiVersion: v1
kind: Endpoints
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-23T10:10:55Z"
creationTimestamp: "2023-05-22T15:11:46Z"
labels:
app: rancher
app.kubernetes.io/managed-by: Helm
chart: rancher-2.7.3
heritage: Helm
release: rancher
name: rancher
namespace: cattle-system
resourceVersion: "301212"
uid: REDACTED
subsets:
- addresses:
- ip: HOST1
nodeName: CONTROLPLANE
targetRef:
kind: Pod
name: rancher-6b4977f897-jrzjx
namespace: cattle-system
uid: REDACTED
- ip: HOST2
nodeName: WORKER2
targetRef:
kind: Pod
name: rancher-6b4977f897-6sf47
namespace: cattle-system
uid: REDACTED
- ip: HOST3
nodeName: WORKER3
targetRef:
kind: Pod
name: rancher-6b4977f897-xx8gf
namespace: cattle-system
uid: REDACTED
ports:
- name: http
port: 80
protocol: TCP
- name: https-internal
port: 444
protocol: TCP
- apiVersion: v1
kind: Endpoints
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-23T10:10:06Z"
creationTimestamp: "2023-05-22T15:17:44Z"
labels:
app.kubernetes.io/managed-by: Helm
name: rancher-webhook
namespace: cattle-system
resourceVersion: "300547"
uid: REDACTED
subsets:
- addresses:
- ip: HOST1
nodeName: WORKER3
targetRef:
kind: Pod
name: rancher-webhook-656cd8b9f-cbjbw
namespace: cattle-system
uid: REDACTED
ports:
- name: https
port: 9443
protocol: TCP
- apiVersion: v1
kind: Endpoints
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-23T10:10:06Z"
creationTimestamp: "2023-05-22T15:17:44Z"
labels:
app.kubernetes.io/managed-by: Helm
name: webhook-service
namespace: cattle-system
resourceVersion: "300546"
uid: REDACTED
subsets:
- addresses:
- ip: HOST1
nodeName: WORKER3
targetRef:
kind: Pod
name: rancher-webhook-656cd8b9f-cbjbw
namespace: cattle-system
uid: REDACTED
ports:
- name: https
port: 8777
protocol: TCP
- addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
- addresses:
- HOST2
conditions:
ready: true
serving: true
terminating: false
nodeName: WORKER2
targetRef:
kind: Pod
name: rancher-6b4977f897-6sf47
namespace: cattle-system
uid: REDACTED
- addresses:
- HOST3
conditions:
ready: true
serving: true
terminating: false
nodeName: WORKER3
targetRef:
kind: Pod
name: rancher-6b4977f897-xx8gf
namespace: cattle-system
uid: REDACTED
- addresses:
- HOST1
conditions:
ready: true
serving: true
terminating: false
nodeName: CONTROLPLANE
targetRef:
kind: Pod
name: rancher-6b4977f897-jrzjx
namespace: cattle-system
uid: REDACTED
kind: EndpointSlice
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-23T10:10:55Z"
creationTimestamp: "2023-05-22T15:11:46Z"
generateName: rancher-
generation: 20
labels:
app: rancher
app.kubernetes.io/managed-by: Helm
chart: rancher-2.7.3
endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
heritage: Helm
kubernetes.io/service-name: rancher
release: rancher
name: rancher-hkpgr
namespace: cattle-system
ownerReferences:
- apiVersion: v1
blockOwnerDeletion: true
controller: true
kind: Service
name: rancher
uid: REDACTED
resourceVersion: "301213"
uid: REDACTED
ports:
- name: http
port: 80
protocol: TCP
- name: https-internal
port: 444
protocol: TCP
- addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
- addresses:
- HOST1
conditions:
ready: true
serving: true
terminating: false
nodeName: WORKER3
targetRef:
kind: Pod
name: rancher-webhook-656cd8b9f-cbjbw
namespace: cattle-system
uid: REDACTED
kind: EndpointSlice
metadata:
creationTimestamp: "2023-05-22T15:17:44Z"
generateName: rancher-webhook-
generation: 6
labels:
app.kubernetes.io/managed-by: Helm
endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
kubernetes.io/service-name: rancher-webhook
name: rancher-webhook-sfgns
namespace: cattle-system
ownerReferences:
- apiVersion: v1
blockOwnerDeletion: true
controller: true
kind: Service
name: rancher-webhook
uid: REDACTED
resourceVersion: "300903"
uid: REDACTED
ports:
- name: https
port: 9443
protocol: TCP
- addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
- addresses:
- HOST1
conditions:
ready: true
serving: true
terminating: false
nodeName: WORKER3
targetRef:
kind: Pod
name: rancher-webhook-656cd8b9f-cbjbw
namespace: cattle-system
uid: REDACTED
kind: EndpointSlice
metadata:
creationTimestamp: "2023-05-22T15:17:44Z"
generateName: webhook-service-
generation: 6
labels:
app.kubernetes.io/managed-by: Helm
endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
kubernetes.io/service-name: webhook-service
name: webhook-service-b4s92
namespace: cattle-system
ownerReferences:
- apiVersion: v1
blockOwnerDeletion: true
controller: true
kind: Service
name: webhook-service
uid: REDACTED
resourceVersion: "300904"
uid: REDACTED
ports:
- name: https
port: 8777
protocol: TCP
kind: List
metadata:
resourceVersion: ""
When it errors out with the 404 page not found, the following is logged in the "rke2-ingress-nginx-controller-
I0523 10:03:33.790084 7 store.go:433] "Found valid IngressClass" ingress="cattle-system/rancher" ingressclass="_"
W0523 10:04:21.542696 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:04:24.876046 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:04:32.725971 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:04:36.060111 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:04:39.393421 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:04:42.726233 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:04:46.059925 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:04:49.392749 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:04:52.726522 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:04:56.059866 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:04:59.393704 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:05:02.726128 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:05:06.060042 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint
It logs the same error above for each service that periodically times out.
@rdb0101 your latest post above is one example of a post that lacks the data needed to analyse or reproduce the problem.
To be precise, if someone can post the controller pod logs and also the output of kubectl get endpointslices -n cattle-system
while the problem is live, then the timestamps in the log messages and the kubectl output can be correlated. Another source of useful info is kubectl -n cattle-system get events
.
If you post kubectl get po -n cattle-system
, you can see the restarts, if any.
If you look at the logs of the rancher pod, you can see rancher events and check whether any are related.
In any case, I don't think any developer can reproduce this problem with the information currently posted in this issue.
@longwuyuan Thank you for clarifying what data is needed in order to make the problem reproducible. Please see below the errors logged when the rancher service goes from having no active endpoint to the ingress being re-synced:
controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
W0523 10:05:19.393530 7 controller.go:1163] Service "cattle-system/rancher" does not have any active Endpoint.
I0523 10:10:08.404944 7 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"cattle-system", Name:"rancher", UID:"REDACTED", APIVersion:"networking.k8s.io/v1", ResourceVersion:"300576", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
I0523 10:12:09.170841 7 status.go:300] "updating Ingress status" namespace="cattle-system" ingress="rancher" currentValue=[{IP:CONTROLPLANE Hostname: Ports:[]} {IP:WORKER3 Hostname: Ports:[]} {IP:WORKER2 Hostname: Ports:[]}] newValue=[{IP:WORKER1 Hostname: Ports:[]} {IP:CONTROLPLANE Hostname: Ports:[]} {IP:WORKER3 Hostname: Ports:[]} {IP:WORKER2 Hostname: Ports:[]}]
Please note that this problem is reproducible by setting up rke2 with a helm install of rancher 2.7.3. This exact issue occurs even in a minimal install.
@rdb0101 I am sorry you are having this issue and I hope it is resolved soon. Here are my thoughts, and I hope you see the practical side of an issue being created here in this project.
If there is a bug in the controller code, then it will occur even on a non-rancher deployment, such as one created using --image nginx:alpine. So if the problem is specific to rancher, you should talk to the Rancher forum; they have Slack as well as a GitHub project.
Currently, I guess that if you get info from a live outage as I describe below, you can help others know where to look for the cause.
Hi @longwuyuan thanks very much for your feedback. I used rancher just as an example; the issue is not specific to rancher, it impacts all of the services I have deployed. I chose rancher because the service name plus the prefix for the endpointslice name is under the 63-character limit; I was trying to determine how, or whether, the nginx-controller was filtering out even the rancher service name despite it being well under that limit. I apologize again if my feedback was unclear. If this issue were specific to just rancher, it would likely only impact the rancher service, correct?
Correct.
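As a side note on the name-length point above, a check like the following rules that factor in or out mechanically. This is only a sketch: the "-xxxxx" suffix is a stand-in for the random suffix the endpointslice controller appends via generateName, and 63 is the DNS-label limit the commenter refers to.

```shell
#!/bin/sh
# Hedged sketch: check whether a Service name leaves room for a
# generated EndpointSlice name like "rancher-hkpgr". The "-xxxxx"
# suffix is a placeholder for the random generateName suffix;
# 63 is the DNS label limit discussed above.
check_slice_name() {
  svc="$1"
  slice="${svc}-xxxxx"
  if [ "${#slice}" -le 63 ]; then
    echo "ok ${#slice}"
  else
    echo "too-long ${#slice}"
  fi
}

check_slice_name rancher   # "rancher-hkpgr" style names are far under the limit
```

For a short name like "rancher", the generated slice name is nowhere near 63 characters, which is consistent with the slice existing and name length not being the filter here.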
I am using v1.7.1 of the controller with TLS and I don't face this problem. Can you try to reproduce the problem in minikube, using the image nginx:alpine to create a deployment, exposed using the ingress-nginx controller and MetalLB?
@longwuyuan Thanks very much for the feedback. I will go ahead and stand up minikube with the version and image as recommended. I will provide the output once I have reproduced the issue.
@longwuyuan Is your current environment multi-node as well?
no
okay thank you for verifying
@rdb0101 @longwuyuan I can confirm that I have the exact same issue. I read this thread thoroughly, and it seems the community needs help capturing live logs, which is what I am facing currently. Here they are:
Kubernetes Cluster: DigitalOcean Managed, version v1.26.3
NGINX Ingress controller
Release: v1.7.1
Build: f48b03be54031491e78472bcf3aa026a81e1ffd3
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.21.6
ingress-nginx Chart Version 4.6.1
$ kubectl -n kratos-staging get svc,ing,ep,endpointslice
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kratos-service ClusterIP 10.245.73.194 <none> 4433/TCP,4434/TCP 62m
NAME CLASS HOSTS ADDRESS PORTS AGE
ingress.networking.k8s.io/api-ingress <none> accounts.example.in 1xx.13x.122.209 80 49m
NAME ENDPOINTS AGE
endpoints/kratos-service <none> 62m
NAME ADDRESSTYPE PORTS ENDPOINTS AGE
endpointslice.discovery.k8s.io/kratos-service-jkb94 IPv4 <unset> <unset> 62m
$ kubectl -n kratos-staging get svc,ing,ep,endpointslice -o yaml
apiVersion: v1
items:
- apiVersion: v1
kind: Service
metadata:
creationTimestamp: "2023-05-29T17:50:03Z"
labels:
app: kratos
name: kratos-service
namespace: kratos-staging
resourceVersion: "107055"
uid: ad9f6739-f132-4678-8a56-0d4ca3f679ff
spec:
clusterIP: 10.245.73.194
clusterIPs:
- 10.245.73.194
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: http-public
port: 4433
protocol: TCP
targetPort: 4433
- name: http-admin
port: 4434
protocol: TCP
targetPort: 4434
selector:
app: kratos
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
- apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/enable-cors: "true"
nginx.ingress.kubernetes.io/rewrite-target: /$2
nginx.ingress.kubernetes.io/ssl-redirect: "false"
creationTimestamp: "2023-05-29T18:02:32Z"
generation: 1
name: api-ingress
namespace: kratos-staging
resourceVersion: "110037"
uid: 18832e6c-fc36-4440-a482-6217078d2c6a
spec:
rules:
- host: accounts.example.in
http:
paths:
- backend:
service:
name: kratos-service
port:
number: 4433
path: /kratos(/|$)(.*)
pathType: Prefix
status:
loadBalancer:
ingress:
- ip: 1xx.1xx.122.209
- apiVersion: v1
kind: Endpoints
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-29T17:50:03Z"
creationTimestamp: "2023-05-29T17:50:03Z"
labels:
app: kratos
name: kratos-service
namespace: kratos-staging
resourceVersion: "107056"
uid: a53b4461-b759-44e5-af3e-7a9322056eac
- addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints: null
kind: EndpointSlice
metadata:
annotations:
endpoints.kubernetes.io/last-change-trigger-time: "2023-05-29T17:50:03Z"
creationTimestamp: "2023-05-29T17:50:03Z"
generateName: kratos-service-
generation: 1
labels:
app: kratos
endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
kubernetes.io/service-name: kratos-service
name: kratos-service-jkb94
namespace: kratos-staging
ownerReferences:
- apiVersion: v1
blockOwnerDeletion: true
controller: true
kind: Service
name: kratos-service
uid: ad9f6739-f132-4678-8a56-0d4ca3f679ff
resourceVersion: "107057"
uid: d2094f71-d8ea-4e20-a225-25f035f64b6b
ports: null
kind: List
metadata:
resourceVersion: ""
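The endpoints: null and ports: null fields in the dump above are the telling part: the EndpointSlice exists but carries no endpoints at all, which is exactly the state that makes the controller log "does not have any active Endpoint". A rough, hedged way to sanity-check a saved dump for ready endpoints (plain grep, no jq assumed; the file paths here are illustrative placeholders):

```shell
#!/bin/sh
# Hedged sketch: count ready endpoints in a saved dump produced by
#   kubectl -n $NS get endpointslice -o json > /tmp/slices.json
# Zero ready endpoints is the state behind ingress-nginx's
# 'Service "..." does not have any active Endpoint' warning.
# This is a rough grep; it assumes kubectl's default JSON spacing.
count_ready() {
  grep -c '"ready": true' "$1" || true
}

# Simulate the empty slice seen in this thread:
printf '{"endpoints": null, "ports": null}\n' > /tmp/empty-slice.json
count_ready /tmp/empty-slice.json   # prints 0: nothing backing the Service
```

If the count is zero while the backing pods are Running and Ready, that points at a selector or readiness mismatch rather than at the controller itself.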
$ kubectl logs nginx-ingress-ingress-nginx-controller-fd49fcc58-zjb2j -n default -f
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: v1.7.1
Build: f48b03be54031491e78472bcf3aa026a81e1ffd3
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.21.6
-------------------------------------------------------------------------------
W0529 18:06:15.955460 8 client_config.go:618] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0529 18:06:15.955660 8 main.go:209] "Creating API client" host="https://10.245.0.1:443"
I0529 18:06:15.970169 8 main.go:253] "Running in Kubernetes cluster" major="1" minor="26" git="v1.26.3" state="clean" commit="9e644106593f3f4aa98f8a84b23db5fa378900bd" platform="linux/amd64"
I0529 18:06:16.113949 8 main.go:104] "SSL fake certificate created" file="/etc/ingress-controller/ssl/default-fake-certificate.pem"
I0529 18:06:16.142720 8 ssl.go:533] "loading tls certificate" path="/usr/local/certificates/cert" key="/usr/local/certificates/key"
I0529 18:06:16.157955 8 nginx.go:261] "Starting NGINX Ingress controller"
I0529 18:06:16.167843 8 event.go:285] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"default", Name:"nginx-ingress-ingress-nginx-controller", UID:"8c5564f0-82f6-4481-84f1-96deae0cf56c", APIVersion:"v1", ResourceVersion:"105147", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap default/nginx-ingress-ingress-nginx-controller
I0529 18:06:17.263997 8 store.go:433] "Found valid IngressClass" ingress="kratos-staging/api-ingress" ingressclass="nginx"
I0529 18:06:17.264673 8 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"kratos-staging", Name:"api-ingress", UID:"18832e6c-fc36-4440-a482-6217078d2c6a", APIVersion:"networking.k8s.io/v1", ResourceVersion:"110037", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
I0529 18:06:17.360198 8 nginx.go:304] "Starting NGINX process"
I0529 18:06:17.360289 8 leaderelection.go:248] attempting to acquire leader lease default/nginx-ingress-ingress-nginx-leader...
I0529 18:06:17.360770 8 nginx.go:324] "Starting validation webhook" address=":8443" certPath="/usr/local/certificates/cert" keyPath="/usr/local/certificates/key"
W0529 18:06:17.361023 8 controller.go:1152] Service "kratos-staging/kratos-service" does not have any active Endpoint.
I0529 18:06:17.361349 8 controller.go:190] "Configuration changes detected, backend reload required"
I0529 18:06:17.372893 8 status.go:84] "New leader elected" identity="nginx-ingress-ingress-nginx-controller-fd49fcc58-qt6tt"
I0529 18:06:17.471508 8 controller.go:207] "Backend successfully reloaded"
I0529 18:06:17.471812 8 controller.go:218] "Initial sync, sleeping for 1 second"
I0529 18:06:17.471937 8 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"nginx-ingress-ingress-nginx-controller-fd49fcc58-zjb2j", UID:"55dbe193-a944-4cdd-acf5-07ffe64fac06", APIVersion:"v1", ResourceVersion:"110912", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W0529 18:06:21.171598 8 controller.go:1152] Service "kratos-staging/kratos-service" does not have any active Endpoint.
W0529 18:06:25.542628 8 controller.go:1152] Service "kratos-staging/kratos-service" does not have any active Endpoint.
W0529 18:06:28.876887 8 controller.go:1152] Service "kratos-staging/kratos-service" does not have any active Endpoint.
W0529 18:06:32.209528 8 controller.go:1152] Service "kratos-staging/kratos-service" does not have any active Endpoint.
I0529 18:06:59.957867 8 status.go:84] "New leader elected" identity="nginx-ingress-ingress-nginx-controller-fd49fcc58-zjb2j"
I0529 18:06:59.957834 8 leaderelection.go:258] successfully acquired lease default/nginx-ingress-ingress-nginx-leader
10.244.0.50 - - [29/May/2023:18:07:49 +0000] "GET /ui/welcome HTTP/2.0" 404 146 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0" 287 0.001 [upstream-default-backend] [] 127.0.0.1:8181 146 0.001 404 20a53ca37e3a44a9942e79b5f7d09594
10.244.0.50 - - [29/May/2023:18:07:59 +0000] "GET / HTTP/2.0" 404 146 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0" 16 0.001 [upstream-default-backend] [] 127.0.0.1:8181 146 0.001 404 ef64dc8200a573ead9dd21852aca603b
10.244.0.50 - - [29/May/2023:18:08:40 +0000] "GET / HTTP/1.1" 404 146 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0" 356 0.001 [upstream-default-backend] [] 127.0.0.1:8181 146 0.000 404 d17b9a4226f85900de13a36efe5d3ba3
10.244.0.50 - - [29/May/2023:18:08:40 +0000] "GET /favicon.ico HTTP/1.1" 404 146 "http://accounts.example.in/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0" 314 0.001 [upstream-default-backend] [] 127.0.0.1:8181 146 0.000 404 ac6dd100945f51f6a71bcf531ed6809f
10.244.0.50 - - [29/May/2023:18:08:47 +0000] "GET /ui/welcome HTTP/2.0" 404 146 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0" 23 0.001 [upstream-default-backend] [] 127.0.0.1:8181 146 0.000 404 44a30be3e49357c7294fa9866aea1c98
10.244.0.50 - - [29/May/2023:18:09:50 +0000] "GET /ui/welcome HTTP/2.0" 404 146 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0" 23 0.000 [upstream-default-backend] [] 127.0.0.1:8181 146 0.001 404 f85294bcfd5a8d0c447d21aae0576c26
10.244.0.50 - - [29/May/2023:18:10:56 +0000] "m\xEB\xC7~0\xC1\xB3\xACtQ\xB6\xE0q\x9E\x19\xBA" 400 150 "-" "-" 0 0.010 [] [] - - - - 28285b6bd2b7568e6a538a1e496a1f5d
2023/05/29 18:11:00 [crit] 27#27: *2471 SSL_do_handshake() failed (SSL: error:0A00006C:SSL routines::bad key share) while SSL handshaking, client: 10.244.0.50, server: 0.0.0.0:443
10.244.0.50 - - [29/May/2023:18:11:01 +0000] "GET / HTTP/1.1" 400 650 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" 206 0.000 [] [] - - - - ddfafb94bdf0ebc378187757913e58df
10.244.0.50 - - [29/May/2023:18:11:01 +0000] "GET /private/api/v1/service/premaster HTTP/1.1" 400 650 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" 238 0.000 [] [] - - - - 3969bf23609b416674790b5922f9c070
10.244.0.50 - - [29/May/2023:18:22:05 +0000] "\x03\x00\x00/*\xE0\x00\x00\x00\x00\x00Cookie: mstshash=Administr" 400 150 "-" "-" 0 0.166 [] [] - - - - 02c371c8d08e63424878349b644715e7
10.244.0.50 - - [29/May/2023:18:25:41 +0000] "CONNECT checkip.amazonaws.com:443 HTTP/1.1" 400 150 "-" "-" 0 4.530 [] [] - - - - 4c21eccbb53aad215f16d7e79e2a8576
10.244.0.50 - - [29/May/2023:18:25:42 +0000] "\x04\x01\x00P\x22\xFF\xAD\xC20\x00" 400 150 "-" "-" 0 0.426 [] [] - - - - 52d309a9de55421b0363061d0ee36a0d
10.244.0.50 - - [29/May/2023:18:25:52 +0000] "\x05\x01\x00" 400 150 "-" "-" 0 0.407 [] [] - - - - 19b8631ce2a32f3cbe233241e73c88c4
10.244.0.50 - - [29/May/2023:18:30:52 +0000] "GET /ui/welcome HTTP/2.0" 404 146 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0" 287 0.001 [upstream-default-backend] [] 127.0.0.1:8181 146 0.001 404 c77db22e17b9876b2243bbf21ad10eb6
10.244.0.50 - - [29/May/2023:18:30:52 +0000] "GET /favicon.ico HTTP/2.0" 499 0 "https://accounts.example.in/ui/welcome" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0" 94 0.000 [upstream-default-backend] [] 127.0.0.1:8181 0 0.000 - 526324dfe16207dc99b228c930b641db
10.244.0.50 - - [29/May/2023:18:49:36 +0000] "CONNECT www.yahoo.com:443 HTTP/1.1" 400 150 "-" "-" 0 0.146 [] [] - - - - 9d18b470b8515b04e59649ff66790b22
10.244.0.50 - - [29/May/2023:19:05:19 +0000] "\x16\x03\x00\x00i\x01\x00\x00e\x03\x03U\x1C\xA7\xE4random1random2random3random4\x00\x00\x0C\x00/\x00" 400 150 "-" "-" 0 0.153 [] [] - - - - 6717ee7d921b5b078441421c7d18173f
@longwuyuan do you think I should try downgrading the helm chart version? If yes, which version should I try? Any suggestions?
@ksingh7 it is helpful that you intended to provide info.
Glad that the provided info has some relevance.
NAME ADDRESSTYPE PORTS ENDPOINTS AGE
endpointslice.discovery.k8s.io/kratos-service-jkb94 IPv4 <unset> <unset> 62m
No helpful analysis is possible because the info is incomplete.
kubectl -n kratos-staging describe svc kratos-service
was needed when the problem happened.
kubectl -n kratos-staging get po -o wide
was needed.
kubectl -n kratos-staging describe po
was needed.
The logs also show another issue: you have a bad TLS configuration.
SSL_do_handshake() failed (SSL: error:0A00006C:SSL routines::bad key share) while SSL handshaking
@longwuyuan this is a blocker issue for us, hence I am trying to provide you more information, as you requested.
SSL_do_handshake() failed (SSL: error:0A00006C:SSL routines::bad key share) while SSL handshaking
$ kubectl -n kratos-staging describe svc kratos-service
Name: kratos-service
Namespace: kratos-staging
Labels: app=kratos
Annotations: <none>
Selector: app=kratos
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.245.73.194
IPs: 10.245.73.194
Port: http-public 4433/TCP
TargetPort: 4433/TCP
Endpoints: 10.244.0.98:4433
Port: http-admin 4434/TCP
TargetPort: 4434/TCP
Endpoints: 10.244.0.98:4434
Session Affinity: None
Events: <none>
$ kubectl -n kratos-staging get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kratos-5bd7697c6b-w4srk 1/1 Running 0 10h 10.244.0.98 pool-h4t86o44q-fen7g <none> <none>
nginx-ingress-ingress-nginx-controller-5c6c7cfb8-7qrm4 1/1 Running 0 13m 10.244.0.32 pool-h4t86o44q-fen7g <none> <none>
$
$ kubectl -n kratos-staging describe po
Name: kratos-5bd7697c6b-w4srk
Namespace: kratos-staging
Priority: 0
Service Account: default
Node: pool-h4t86o44q-fen7g/10.122.0.2
Start Time: Tue, 30 May 2023 01:12:10 +0530
Labels: app=kratos
pod-template-hash=5bd7697c6b
Annotations: <none>
Status: Running
IP: 10.244.0.98
IPs:
IP: 10.244.0.98
Controlled By: ReplicaSet/kratos-5bd7697c6b
Containers:
kratos:
Container ID: containerd://f008afa67b5c9952229c61e0812e2439c5a6ec215f5910619493c3936a2de6f6
Image: oryd/kratos
Image ID: docker.io/oryd/kratos@sha256:5ec9808accebd4826b15b21bc6bcaa4410d3dc451ebe2bf8da812042df046ceb
Ports: 4433/TCP, 4434/TCP
Host Ports: 0/TCP, 0/TCP
Command:
kratos
-c
/etc/config/kratos/kratos.yml
serve
State: Running
Started: Tue, 30 May 2023 01:12:11 +0530
Ready: True
Restart Count: 0
Limits:
cpu: 500m
memory: 128Mi
Requests:
cpu: 500m
memory: 128Mi
Environment Variables from:
kratos-env Secret Optional: false
Environment: <none>
Mounts:
/etc/config/identity.schema.json from kratos-identity-schema (rw,path="identity.schema.json")
/etc/config/kratos/kratos.yml from kratos-config (rw,path="kratos.yml")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-bk8jr (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kratos-identity-schema:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: identity-schema-config
Optional: false
kratos-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kratos-config
Optional: false
kube-api-access-bk8jr:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
Name: nginx-ingress-ingress-nginx-controller-5c6c7cfb8-7qrm4
Namespace: kratos-staging
Priority: 0
Service Account: nginx-ingress-ingress-nginx
Node: pool-h4t86o44q-fen7g/10.122.0.2
Start Time: Tue, 30 May 2023 11:30:26 +0530
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=nginx-ingress
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.7.1
helm.sh/chart=ingress-nginx-4.6.1
pod-template-hash=5c6c7cfb8
Annotations: <none>
Status: Running
IP: 10.244.0.32
IPs:
IP: 10.244.0.32
Controlled By: ReplicaSet/nginx-ingress-ingress-nginx-controller-5c6c7cfb8
Containers:
controller:
Container ID: containerd://1b5bf2e1ffc5582cfc4e40f53401b7868cc081a3e8fdeb062b0cd3a92d51920b
Image: registry.k8s.io/ingress-nginx/controller:v1.7.1@sha256:7244b95ea47bddcb8267c1e625fb163fc183ef55448855e3ac52a7b260a60407
Image ID: registry.k8s.io/ingress-nginx/controller@sha256:7244b95ea47bddcb8267c1e625fb163fc183ef55448855e3ac52a7b260a60407
Ports: 80/TCP, 443/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
Args:
/nginx-ingress-controller
--publish-service=$(POD_NAMESPACE)/nginx-ingress-ingress-nginx-controller
--election-id=nginx-ingress-ingress-nginx-leader
--controller-class=k8s.io/ingress-nginx
--ingress-class=nginx
--configmap=$(POD_NAMESPACE)/nginx-ingress-ingress-nginx-controller
--watch-namespace=kratos-staging
--validating-webhook=:8443
--validating-webhook-certificate=/usr/local/certificates/cert
--validating-webhook-key=/usr/local/certificates/key
State: Running
Started: Tue, 30 May 2023 11:30:27 +0530
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 90Mi
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: nginx-ingress-ingress-nginx-controller-5c6c7cfb8-7qrm4 (v1:metadata.name)
POD_NAMESPACE: kratos-staging (v1:metadata.namespace)
LD_PRELOAD: /usr/local/lib/libmimalloc.so
Mounts:
/usr/local/certificates/ from webhook-cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gmgbk (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
webhook-cert:
Type: Secret (a volume populated by a Secret)
SecretName: nginx-ingress-ingress-nginx-admission
Optional: false
kube-api-access-gmgbk:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned kratos-staging/nginx-ingress-ingress-nginx-controller-5c6c7cfb8-7qrm4 to pool-h4t86o44q-fen7g
Normal Pulled 14m kubelet Container image "registry.k8s.io/ingress-nginx/controller:v1.7.1@sha256:7244b95ea47bddcb8267c1e625fb163fc183ef55448855e3ac52a7b260a60407" already present on machine
Normal Created 14m kubelet Created container controller
Normal Started 14m kubelet Started container controller
Normal RELOAD 4m53s (x2 over 14m) nginx-ingress-controller NGINX reload triggered due to a change in configuration
$
@ksingh7 thank you for trying to work towards finding the cause of this issue. However, I agree with @longwuyuan that the issue you have presented is related to an invalid TLS certificate configuration. The error could stem from not having a valid TLS certificate, although I am not entirely sure; you need to check which secret your service is using and which secret the nginx ingress-controller is using.
I have been experiencing this issue for the last couple of weeks on multiple clusters. The environment does not have TLS configured; nginx was installed using helm.
AKS 1.26.3, Nginx chart 4.6.1
kubectl get svc,ing,ep,endpointslice -n bmdev-ne-linker
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/bmdev-ne-linker-service ClusterIP 172.16.252.129
NAME CLASS HOSTS ADDRESS PORTS AGE
ingress.networking.k8s.io/bmdev-ne-linker-ingress nginx linker-dev.***.com 10.1.0.5 80 85m
NAME ENDPOINTS AGE
endpoints/bmdev-ne-linker-service 10.2.0.57:80 85m
NAME ADDRESSTYPE PORTS ENDPOINTS AGE
endpointslice.discovery.k8s.io/bmdev-ne-linker-service-xmtlb IPv4 80 10.2.0.57 85m
kubectl get svc,ing,ep,endpointslice -n bmdev-ne-linker -o yaml
apiVersion: v1
items:
- addressType: IPv4
  apiVersion: discovery.k8s.io/v1
  endpoints:
kubectl logs nginx-ingress-ingress-nginx-controller-6679b95c85-2zb6l -n bmdev-ne-nginx
NGINX Ingress controller
  Release:       v1.7.1
  Build:         f48b03be54031491e78472bcf3aa026a81e1ffd3
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6
W0601 08:40:02.319786 7 client_config.go:618] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0601 08:40:02.319902 7 main.go:209] "Creating API client" host="https://172.16.0.1:443"
I0601 08:40:02.344452 7 main.go:253] "Running in Kubernetes cluster" major="1" minor="26" git="v1.26.3" state="clean" commit="9e644106593f3f4aa98f8a84b23db5fa378900bd" platform="linux/amd64"
I0601 08:40:02.571681 7 main.go:104] "SSL fake certificate created" file="/etc/ingress-controller/ssl/default-fake-certificate.pem"
I0601 08:40:02.590743 7 ssl.go:533] "loading tls certificate" path="/usr/local/certificates/cert" key="/usr/local/certificates/key"
I0601 08:40:02.599475 7 nginx.go:261] "Starting NGINX Ingress controller"
I0601 08:40:02.607580 7 event.go:285] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"bmdev-ne-nginx", Name:"nginx-ingress-ingress-nginx-controller", UID:"d3046098-f6e9-4864-a1cd-544f102e6c96", APIVersion:"v1", ResourceVersion:"13283", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap bmdev-ne-nginx/nginx-ingress-ingress-nginx-controller
I0601 08:40:03.801497 7 nginx.go:304] "Starting NGINX process"
I0601 08:40:03.801587 7 leaderelection.go:248] attempting to acquire leader lease bmdev-ne-nginx/nginx-ingress-ingress-nginx-leader...
I0601 08:40:03.801800 7 nginx.go:324] "Starting validation webhook" address=":8443" certPath="/usr/local/certificates/cert" keyPath="/usr/local/certificates/key"
I0601 08:40:03.801915 7 controller.go:190] "Configuration changes detected, backend reload required"
I0601 08:40:03.815509 7 leaderelection.go:258] successfully acquired lease bmdev-ne-nginx/nginx-ingress-ingress-nginx-leader
I0601 08:40:03.815654 7 status.go:84] "New leader elected" identity="nginx-ingress-ingress-nginx-controller-6679b95c85-2zb6l"
I0601 08:40:03.842223 7 controller.go:207] "Backend successfully reloaded"
I0601 08:40:03.842397 7 controller.go:218] "Initial sync, sleeping for 1 second"
I0601 08:40:03.842459 7 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"bmdev-ne-nginx", Name:"nginx-ingress-ingress-nginx-controller-6679b95c85-2zb6l", UID:"8aa24f23-825a-4c36-ac01-664c471cdec9", APIVersion:"v1", ResourceVersion:"13317", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W0601 08:47:34.889217 7 controller.go:1152] Service "bmdev-ne-linker/bmdev-ne-linker-service" does not have any active Endpoint.
I0601 08:47:34.911404 7 admission.go:149] processed ingress via admission controller {testedIngressLength:1 testedIngressTime:0.022s renderingIngressLength:1 renderingIngressTime:0s admissionTime:18.4kBs testedConfigurationSize:0.023}
I0601 08:47:34.911428 7 main.go:100] "successfully validated configuration, accepting" ingress="bmdev-ne-linker/bmdev-ne-linker-ingress"
I0601 08:47:34.915890 7 store.go:433] "Found valid IngressClass" ingress="bmdev-ne-linker/bmdev-ne-linker-ingress" ingressclass="nginx"
I0601 08:47:34.916047 7 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"bmdev-ne-linker", Name:"bmdev-ne-linker-ingress", UID:"2dda1d8a-5d2c-4fc4-a29c-1d7c6484fc75", APIVersion:"networking.k8s.io/v1", ResourceVersion:"16531", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
W0601 08:47:37.969723 7 controller.go:1152] Service "bmdev-ne-linker/bmdev-ne-linker-service" does not have any active Endpoint.
I0601 08:47:37.969820 7 controller.go:190] "Configuration changes detected, backend reload required"
I0601 08:47:38.013711 7 controller.go:207] "Backend successfully reloaded"
I0601 08:47:38.013900 7 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"bmdev-ne-nginx", Name:"nginx-ingress-ingress-nginx-controller-6679b95c85-2zb6l", UID:"8aa24f23-825a-4c36-ac01-664c471cdec9", APIVersion:"v1", ResourceVersion:"13317", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
I0601 08:48:03.824349 7 status.go:300] "updating Ingress status" namespace="bmdev-ne-linker" ingress="bmdev-ne-linker-ingress" currentValue=[] newValue=[{IP:10.1.0.5 Hostname: Ports:[]}]
I0601 08:48:03.829844 7 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"bmdev-ne-linker", Name:"bmdev-ne-linker-ingress", UID:"2dda1d8a-5d2c-4fc4-a29c-1d7c6484fc75", APIVersion:"networking.k8s.io/v1", ResourceVersion:"16769", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
I0601 10:02:01.269063 7 admission.go:149] processed ingress via admission controller {testedIngressLength:1 testedIngressTime:0.023s renderingIngressLength:1 renderingIngressTime:0s admissionTime:18.4kBs testedConfigurationSize:0.023}
I0601 10:02:01.269089 7 main.go:100] "successfully validated configuration, accepting" ingress="bmdev-ne-linker/bmdev-ne-linker-ingress"
I0601 10:02:01.274422 7 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"bmdev-ne-linker", Name:"bmdev-ne-linker-ingress", UID:"2dda1d8a-5d2c-4fc4-a29c-1d7c6484fc75", APIVersion:"networking.k8s.io/v1", ResourceVersion:"47332", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
I0601 10:02:01.274632 7 controller.go:190] "Configuration changes detected, backend reload required"
I0601 10:02:01.318014 7 controller.go:207] "Backend successfully reloaded"
I0601 10:02:01.318192 7 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"bmdev-ne-nginx", Name:"nginx-ingress-ingress-nginx-controller-6679b95c85-2zb6l", UID:"8aa24f23-825a-4c36-ac01-664c471cdec9", APIVersion:"v1", ResourceVersion:"13317", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W0601 10:05:49.745387 7 controller.go:1152] Service "bmdev-ne-linker/bmdev-ne-linker-service" does not have any active Endpoint.
W0601 10:05:53.079353 7 controller.go:1152] Service "bmdev-ne-linker/bmdev-ne-linker-service" does not have any active Endpoint.
W0601 10:05:56.413340 7 controller.go:1152] Service "bmdev-ne-linker/bmdev-ne-linker-service" does not have any active Endpoint.
@matthewbrumpton From your data, it's helpful to know that the controller logged this error message
10:05:49.745387 7 controller.go:1152] Service "bmdev-ne-linker/bmdev-ne-linker-service" does not have any active Endpoint.
but the trail ends there, because the output of kubectl describe ing bmdev-ne-linker-ingress
needed to be captured at the timestamp of the error message and posted here, so that the error message can be correlated with the object state. The endpoint IP addresses are displayed as part of the ingress describe output.
Also, I am not sure whether routing breaks at the time this error message is logged in the controller pod. Do HTTP/HTTPS requests to the URL matched by that ingress's rules still work (response code 200) while this error message is being logged?
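One way to capture the describe output at exactly the warning's timestamp is to watch the controller log and snapshot object state on a match. A minimal sketch, under assumptions: $CONTROLLER_NS and $APP_NS are hypothetical placeholders, the deployment name is illustrative, and only the pure pattern match runs here; the kubectl commands are commented out.

```shell
# Matches the controller warning quoted above.
is_endpoint_warning() {
  printf '%s\n' "$1" | grep -q 'does not have any active Endpoint'
}

sample='W0601 10:05:49.745387 7 controller.go:1152] Service "bmdev-ne-linker/bmdev-ne-linker-service" does not have any active Endpoint.'
is_endpoint_warning "$sample" && echo "warning matched"

# Against a live cluster, stream the log and snapshot state on each match
# (not run here; names are placeholders):
# kubectl -n "$CONTROLLER_NS" logs -f deploy/nginx-ingress-ingress-nginx-controller \
#   | while read -r line; do
#       if is_endpoint_warning "$line"; then
#         kubectl -n "$APP_NS" describe ing,svc,endpointslice > "state-$(date +%s).txt"
#       fi
#     done
```

The snapshot files can then be diffed against each other to see whether the EndpointSlice was actually empty when the warning fired.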
@longwuyuan The error message is logged in the controller when the bmdev-ne-linker pod starts up, not on an HTTP request.
W0601 10:54:49.605701 7 controller.go:1152] Service "bmdev-ne-linker/bmdev-ne-linker-service" does not have any active Endpoint.
W0601 10:54:52.939783 7 controller.go:1152] Service "bmdev-ne-linker/bmdev-ne-linker-service" does not have any active Endpoint.
@matthewbrumpton it's not known whether the error message keeps repeating after reconciling, or whether the HTTP/HTTPS requests keep failing.
So is this issue about an error message during startup, before reconciling?
@longwuyuan , the error occurs during startup, before reconciling. I am unable to get any further, as our Azure Front Door cannot reach the endpoint.
@matthewbrumpton so from the point of view of the ingress-nginx controller, do you mean to say that your HTTP/HTTPS requests fail? If yes, then obviously you need to show data such as:
As you can see above, repeated requests have been made for the data needed to analyse the state at the time the problem occurs, but for whatever reason there is no way to reproduce the problem on a minikube or kind cluster, and there is no data such as the kubectl describe output of the ingress from when the problem happened.
@longwuyuan , recreated on minikube
helm install nginx-ingress ingress-nginx/ingress-nginx --version 4.6.1 \
  --create-namespace --namespace bmde-ne-nginx \
  --set controller.replicaCount=1 --set controller.metrics.enabled=true \
  --set controller.nodeSelector."kubernetes\.io/os"=linux \
  --set controller.admissionWebhooks.patch.nodeSelector."kubernetes\.io/os"=linux \
  --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-internal"=true
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aks-helloworld-one
  namespace: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aks-helloworld-one
  template:
    metadata:
      labels:
        app: aks-helloworld-one
    spec:
      containers:
---
apiVersion: v1
kind: Service
metadata:
  name: aks-helloworld-one
  namespace: httpbin
spec:
  type: ClusterIP
  ports:
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-world-ingress
  namespace: httpbin
spec:
  ingressClassName: nginx
  rules:
hi @matthewbrumpton thank you for recreating it in minikube. Did you experience the same error even using minikube?
Why use the Azure annotation on minikube?
@longwuyuan , same error with a deployment I use for testing
W0601 14:30:38.998226 7 controller.go:1152] Service "httpbin/httpbin" does not have any active Endpoint.
NGINX Ingress controller
  Release:       v1.7.1
  Build:         f48b03be54031491e78472bcf3aa026a81e1ffd3
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6
W0601 14:28:47.702631 7 client_config.go:618] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0601 14:28:47.702837 7 main.go:209] "Creating API client" host="https://10.96.0.1:443"
I0601 14:28:47.707842 7 main.go:253] "Running in Kubernetes cluster" major="1" minor="26" git="v1.26.3" state="clean" commit="9e644106593f3f4aa98f8a84b23db5fa378900bd" platform="linux/amd64"
I0601 14:28:47.842403 7 main.go:104] "SSL fake certificate created" file="/etc/ingress-controller/ssl/default-fake-certificate.pem"
I0601 14:28:47.867948 7 ssl.go:533] "loading tls certificate" path="/usr/local/certificates/cert" key="/usr/local/certificates/key"
I0601 14:28:47.876005 7 nginx.go:261] "Starting NGINX Ingress controller"
I0601 14:28:47.883008 7 event.go:285] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"bmde-ne-nginx", Name:"nginx-ingress-ingress-nginx-controller", UID:"dca48c46-1ed7-431c-a283-cc4d9a8b5723", APIVersion:"v1", ResourceVersion:"1327", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap bmde-ne-nginx/nginx-ingress-ingress-nginx-controller
I0601 14:28:49.078015 7 nginx.go:304] "Starting NGINX process"
I0601 14:28:49.078190 7 leaderelection.go:248] attempting to acquire leader lease bmde-ne-nginx/nginx-ingress-ingress-nginx-leader...
I0601 14:28:49.078556 7 nginx.go:324] "Starting validation webhook" address=":8443" certPath="/usr/local/certificates/cert" keyPath="/usr/local/certificates/key"
I0601 14:28:49.078875 7 controller.go:190] "Configuration changes detected, backend reload required"
I0601 14:28:49.092466 7 leaderelection.go:258] successfully acquired lease bmde-ne-nginx/nginx-ingress-ingress-nginx-leader
I0601 14:28:49.092562 7 status.go:84] "New leader elected" identity="nginx-ingress-ingress-nginx-controller-6679b95c85-wzf2q"
I0601 14:28:49.127279 7 controller.go:207] "Backend successfully reloaded"
I0601 14:28:49.127411 7 controller.go:218] "Initial sync, sleeping for 1 second"
I0601 14:28:49.127472 7 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"bmde-ne-nginx", Name:"nginx-ingress-ingress-nginx-controller-6679b95c85-wzf2q", UID:"3cc894ae-b6ae-40b4-8e06-370a03e3df9f", APIVersion:"v1", ResourceVersion:"1380", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W0601 14:30:38.998226 7 controller.go:1152] Service "httpbin/httpbin" does not have any active Endpoint.
I0601 14:30:39.030518 7 admission.go:149] processed ingress via admission controller {testedIngressLength:1 testedIngressTime:0.032s renderingIngressLength:1 renderingIngressTime:0.001s admissionTime:18.1kBs testedConfigurationSize:0.033}
I0601 14:30:39.030584 7 main.go:100] "successfully validated configuration, accepting" ingress="httpbin/httpbin"
I0601 14:30:39.035382 7 store.go:433] "Found valid IngressClass" ingress="httpbin/httpbin" ingressclass="nginx"
I0601 14:30:39.035599 7 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"httpbin", Name:"httpbin", UID:"6a033dd2-7c89-40c3-bcb1-ec88db7cca86", APIVersion:"networking.k8s.io/v1", ResourceVersion:"1549", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
W0601 14:30:42.317416 7 controller.go:1152] Service "httpbin/httpbin" does not have any active Endpoint.
I0601 14:30:42.317602 7 controller.go:190] "Configuration changes detected, backend reload required"
I0601 14:30:42.437945 7 controller.go:207] "Backend successfully reloaded"
I0601 14:30:42.438185 7 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"bmde-ne-nginx", Name:"nginx-ingress-ingress-nginx-controller-6679b95c85-wzf2q", UID:"3cc894ae-b6ae-40b4-8e06-370a03e3df9f", APIVersion:"v1", ResourceVersion:"1380", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
Installed without Azure annotations, with same error:
I0601 17:37:38.119381 7 controller.go:207] "Backend successfully reloaded"
I0601 17:37:38.119590 7 controller.go:218] "Initial sync, sleeping for 1 second"
I0601 17:37:38.119733 7 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"bmde-ne-nginx", Name:"nginx-ingress-ingress-nginx-controller-6679b95c85-tzwk9", UID:"d039bd8f-5901-492d-9767-7485515bf49d", APIVersion:"v1", ResourceVersion:"12177", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W0601 17:37:41.937126 7 controller.go:1152] Service "httpbin/httpbin" does not have any active Endpoint.
W0601 17:37:46.303769 7 controller.go:1152] Service "httpbin/httpbin" does not have any active Endpoint.
W0601 17:37:49.640181 7 controller.go:1152] Service "httpbin/httpbin" does not have any active Endpoint.
W0601 17:37:52.974091 7 controller.go:1152] Service "httpbin/httpbin" does not have any active Endpoint.
helm install nginx-ingress ingress-nginx/ingress-nginx --version 4.6.1 \
  --create-namespace --namespace bmde-ne-nginx \
  --set controller.replicaCount=1 --set controller.metrics.enabled=true \
  --set controller.nodeSelector."kubernetes\.io/os"=linux \
  --set controller.admissionWebhooks.patch.nodeSelector."kubernetes\.io/os"=linux
@matthewbrumpton thanks for the data. My comments below
% k -n ingress-nginx logs ingress-nginx-controller-7bbdbc6f49-xmv9s | grep -i error | grep -i endpoint
W0601 05:46:33.584633 8 controller.go:1102] Error obtaining Endpoints for Service "default/hello-world": no object matching key "default/hello-world" in local store
W0601 05:48:02.006558 8 controller.go:1102] Error obtaining Endpoints for Service "default/hello-world": no object matching key "default/hello-world" in local store
W0601 05:49:34.333501 8 controller.go:1102] Error obtaining Endpoints for Service "default/hello-world": no object matching key "default/hello-world" in local store
W0601 05:49:37.667914 8 controller.go:1102] Error obtaining Endpoints for Service "default/hello-world": no object matching key "default/hello-world" in local store
W0601 05:49:41.000857 8 controller.go:1102] Error obtaining Endpoints for Service "default/hello-world": no object matching key "default/hello-world" in local store
[~]
% curl -I https://grafana.dev.enjoydevops.com -L
HTTP/2 302
date: Thu, 01 Jun 2023 18:18:01 GMT
content-type: text/html; charset=utf-8
cache-control: no-store
location: /login
x-content-type-options: nosniff
x-frame-options: deny
x-xss-protection: 1; mode=block
strict-transport-security: max-age=15724800; includeSubDomains
HTTP/2 200
date: Thu, 01 Jun 2023 18:18:01 GMT
content-type: text/html; charset=UTF-8
cache-control: no-store
x-content-type-options: nosniff
x-frame-options: deny
x-xss-protection: 1; mode=block
strict-transport-security: max-age=15724800; includeSubDomains
- So we need a clear, accurate, and detailed description of the problem that has to be solved in the controller
- I already typed out the commands in https://github.com/kubernetes/ingress-nginx/issues/9932#issuecomment-1572027445 that hint at the information needed to make progress here, but your reports do not contain information that would either describe the problem accurately or help analyse the active state of the ingress resource and the ingress-controller
- It is not clear whether you are reporting an error message that happened at a point in time, or a broken ingress where the error message's timestamp matches an HTTP/HTTPS request you sent to the ingress-nginx controller
@longwuyuan Will this be enough information in conjunction with the description you have mentioned? Do you think this would be considered a bug?
@rdb0101 if it is a bug, then usually an issue will be tagged as a bug, even without a proper description (in the beginning).
I am not able to reproduce this on minikube. As I mentioned, I see the error message with an old timestamp, not with a timestamp from when I access my app. That error is not logged again after that early timestamp during cluster startup, when state and config had not yet been reconciled.
cc @strongjz for any comments
@longwuyuan So the error messages don't necessarily occur at the exact time one accesses the app. As mentioned before, when I am able to access my app(s) there is no error message, as the ingress-controller "sees" the app --> service's corresponding endpoint. It's when the app(s) become inaccessible that the error message appears, despite there being an active and valid endpoint. At any point are you unable to access your app at all?
I am able to access my app all the time, and I do not get any new error message after the early pre-reconciliation message.
And so I have asked for that clear state information, where at one given timestamp (plus or minus a few seconds):
- kubectl describe ... outputs for the ingress, the app service, and the app pod, captured immediately when curl failed
- kubectl get events ... from the app namespace and the controller namespace, showing events about failed objects and failed requests
This info will be proof of the bug.
Or write a step-by-step instruction on how anyone can reproduce the problem of failing HTTP/HTTPS requests with that error message in the controller logs.
If you have pods restarting or the network breaking intermittently, then that may also cause failed requests.
I think that if the controller were broken, this would happen to every user, because EndpointSlices have been used for every single ingress for the last 3-4 releases.
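Since the controller routes from EndpointSlices, one concrete thing to check when the warning fires is whether the slice's endpoints carry ready=true conditions. A sketch under assumptions: the JSON fragment below is a trimmed, hypothetical example of one slice endpoint, and the commented kubectl jsonpath is illustrative; only the string check runs here.

```shell
# Trimmed, hypothetical fragment of one EndpointSlice endpoint entry.
slice_endpoint='{"addresses":["10.2.0.57"],"conditions":{"ready":true}}'

# Crude readiness probe on the JSON text (a real script would use jq).
endpoint_ready() {
  printf '%s' "$1" | grep -q '"ready":true'
}

endpoint_ready "$slice_endpoint" && echo "endpoint is ready"

# Live check (not run here): print each endpoint address with its ready condition
# kubectl -n bmdev-ne-linker get endpointslice \
#   -l kubernetes.io/service-name=bmdev-ne-linker-service \
#   -o jsonpath='{range .items[*].endpoints[*]}{.addresses[0]} ready={.conditions.ready}{"\n"}{end}'
```

If every endpoint shows ready=true while the controller still logs the warning, that would be stronger evidence of a controller-side bug than the warning alone.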
Hi @longwuyuan I can work on step-by-step instructions on how to reproduce the problem. The app only breaks from the frontend; on the backend the apps still work and respond as expected. Hopefully others will be able to provide some additional feedback as well.
@rdb0101 I think it's important to establish whether the controller has a problem that causes routing to the app to fail. If your HTTP/HTTPS request to your frontend breaks, that does not directly mean the controller is causing the problem.
So first we need the kubectl describe ingress ... output from when your HTTP/HTTPS request to the frontend fails. Then we need to match the timestamp of the request you sent against the timestamps of the log messages related to that request. Then we need to check whether the pod had temporary problems such as networking, CPU, or memory pressure, or find some other proof that, with no other problems in the cluster, the ingress-nginx controller alone caused the broken routing and the error message.
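The timestamp matching described above can be semi-automated: the klog prefix (e.g. W0601 10:05:49.745387) reduces to a wall-clock time that can be compared with the time curl was run. A minimal sketch; $HOST and $CONTROLLER_NS are placeholders, and the curl/kubectl steps are commented out.

```shell
# Extract HH:MM:SS from a klog line ("W0601 10:05:49.745387 7 controller.go:...").
klog_hms() {
  printf '%s\n' "$1" | awk '{ print substr($2, 1, 8) }'
}

line='W0601 10:05:49.745387 7 controller.go:1152] Service "httpbin/httpbin" does not have any active Endpoint.'
echo "warning logged at $(klog_hms "$line")"

# Correlation against a live cluster (not run here):
# start=$(date -u +%H:%M:%S)   # controller logs are UTC by default
# curl -s -o /dev/null -w '%{http_code}\n' "https://$HOST/"
# kubectl -n "$CONTROLLER_NS" logs deploy/nginx-ingress-ingress-nginx-controller --since=2m \
#   | grep 'does not have any active Endpoint'
# # then compare $start against the HH:MM:SS values extracted above
```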
I just started facing this issue after enabling TLS on the ingress. The exact same ingress config (plus the service and everything down the line) still works flawlessly when TLS is disabled. Using AKS and the nginx-ingress controller.
What happened: The ingress controller reported that the "Service does not have any active Endpoint" when in fact the service did have active endpoints.
I was able to verify the service was active by execing into the nginx pod and curling the health check endpoint of the service.
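That in-pod verification can be scripted. A sketch under assumptions: $CONTROLLER_NS, $CONTROLLER_POD, the backend pod IP, and the /healthz path are all hypothetical placeholders; only the status-code classification runs here.

```shell
# Treat any 2xx/3xx response as "backend reachable"; anything else as a failure.
is_reachable() {
  case "$1" in
    2??|3??) return 0 ;;
    *)       return 1 ;;
  esac
}

is_reachable 200 && echo "backend reachable"

# From inside the controller pod, curl the backend that the warning claims has
# no active endpoint (not run here; names and path are placeholders):
# code=$(kubectl -n "$CONTROLLER_NS" exec "$CONTROLLER_POD" -- \
#   curl -s -o /dev/null -w '%{http_code}' http://10.2.0.57:80/healthz)
# is_reachable "$code" && echo "service answers despite the warning"
```

A 2xx/3xx from inside the controller pod while the warning is being logged is exactly the contradiction the issue describes.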
The only way I was able to recover was to reinstall the helm chart.
What you expected to happen:
The service to be added to the ingress controller.
NGINX Ingress controller version:
Kubernetes version (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"25+", GitVersion:"v1.25.6-eks-48e63af", GitCommit:"9f22d4ae876173884749c0701f01340879ab3f95", GitTreeState:"clean", BuildDate:"2023-01-24T19:19:02Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}
Environment: AWS EKS
How was the ingress-nginx-controller installed (helm ls output):
NAME    NAMESPACE   REVISION   UPDATED                                   STATUS     CHART                 APP VERSION
nginx   nginx       1          2023-05-06 16:52:09.643618809 +0000 UTC   deployed   ingress-nginx-4.5.2   1.6.4
Values:
How to reproduce this issue: Unknown. There was a single replica of the pod, and it was deployed for 42 days before exhibiting this problem.
However, others have recently reported this issue in https://github.com/kubernetes/ingress-nginx/issues/6135.
Anything else we need to know:
The problem was previously reported in https://github.com/kubernetes/ingress-nginx/issues/6135, but the defect was closed.