briananstett opened 4 weeks ago
@briananstett could you try using a different namespace than kube-system?
It might be due to that.
Thank you for the reply @oktalz.
I created a new namespace on my EKS cluster (version 1.30) and installed a fresh installation of HAProxy using the Helm chart and all the default values.
kubectl create namespace haproxy
kubens haproxy
helm install haproxy haproxytech/kubernetes-ingress
NAME: haproxy
LAST DEPLOYED: Wed Jun 12 08:54:46 2024
NAMESPACE: haproxy
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
HAProxy Kubernetes Ingress Controller has been successfully installed.
Controller image deployed is: "haproxytech/kubernetes-ingress:1.11.4".
Your controller is of a "Deployment" kind. Your controller service is running as a "NodePort" type.
RBAC authorization is enabled.
Controller ingress.class is set to "haproxy" so make sure to use same annotation for
Ingress resource.
Service ports mapped are:
- name: http
containerPort: 8080
protocol: TCP
- name: https
containerPort: 8443
protocol: TCP
- name: stat
containerPort: 1024
protocol: TCP
- name: quic
containerPort: 8443
protocol: UDP
Node IP can be found with:
$ kubectl --namespace haproxy get nodes -o jsonpath="{.items[0].status.addresses[1].address}"
The following ingress resource routes traffic to pods that match the following:
* service name: web
* client's Host header: webdemo.com
* path begins with /
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: web-ingress
namespace: default
annotations:
ingress.class: "haproxy"
spec:
rules:
- host: webdemo.com
http:
paths:
- path: /
backend:
serviceName: web
servicePort: 80
In case that you are using multi-ingress controller environment, make sure to use ingress.class annotation and match it
with helm chart option controller.ingressClass.
For more examples and up to date documentation, please visit:
* Helm chart documentation: https://github.com/haproxytech/helm-charts/tree/main/kubernetes-ingress
* Controller documentation: https://www.haproxy.com/documentation/kubernetes/latest/
* Annotation reference: https://github.com/haproxytech/kubernetes-ingress/tree/master/documentation
* Image parameters reference: https://github.com/haproxytech/kubernetes-ingress/blob/master/documentation/controller.md
The same behavior begins immediately, though: the pods crash with exit code 137.
kubectl get pods
NAME READY STATUS RESTARTS AGE
haproxy-kubernetes-ingress-7d8448d8b5-4rcpq 0/1 CrashLoopBackOff 1 (4s ago) 6s
haproxy-kubernetes-ingress-7d8448d8b5-g7pfw 0/1 CrashLoopBackOff 1 (5s ago) 6s
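As a side note (my own reading, not from the output above): exit code 137 is 128 plus the signal number, meaning the container was killed with SIGKILL (signal 9) — consistent with either the kernel OOM killer or the kubelet force-killing the container after a failed startup probe. The arithmetic:

```shell
# Exit codes above 128 mean "killed by signal (code - 128)".
code=137
sig=$((code - 128))
echo "killed by signal $sig"   # signal 9 = SIGKILL
```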
Name: haproxy-kubernetes-ingress-7d8448d8b5-4rcpq
Namespace: haproxy
Priority: 0
Service Account: haproxy-kubernetes-ingress
Node: ip-10-11-39-28.ec2.internal/10.11.39.28
Start Time: Wed, 12 Jun 2024 09:09:48 -0400
Labels: app.kubernetes.io/instance=haproxy
app.kubernetes.io/name=kubernetes-ingress
pod-template-hash=7d8448d8b5
Annotations: <none>
Status: Running
IP: 10.11.57.131
IPs:
IP: 10.11.57.131
Controlled By: ReplicaSet/haproxy-kubernetes-ingress-7d8448d8b5
Containers:
kubernetes-ingress-controller:
Container ID: containerd://f12d8ea43047c51eb4efeea433560eb5e131ae5ac1a30432ba4a07cc7efaf07d
Image: haproxytech/kubernetes-ingress:1.11.4
Image ID: docker.io/haproxytech/kubernetes-ingress@sha256:c5f8a41ef0d4b177bec10f082da578f2be69af9a54b719a76ea6ce2707f4248e
Ports: 8080/TCP, 8443/TCP, 1024/TCP, 8443/UDP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/UDP
Args:
--default-ssl-certificate=haproxy/haproxy-kubernetes-ingress-default-cert
--configmap=haproxy/haproxy-kubernetes-ingress
--http-bind-port=8080
--https-bind-port=8443
--quic-bind-port=8443
--quic-announce-port=443
--ingress.class=haproxy
--publish-service=haproxy/haproxy-kubernetes-ingress
--log=info
--prometheus
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Wed, 12 Jun 2024 09:10:14 -0400
Finished: Wed, 12 Jun 2024 09:10:14 -0400
Ready: False
Restart Count: 2
Requests:
cpu: 250m
memory: 400Mi
Liveness: http-get http://:1042/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:1042/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
Startup: http-get http://:1042/healthz delay=0s timeout=1s period=1s #success=1 #failure=20
Environment:
POD_NAME: haproxy-kubernetes-ingress-7d8448d8b5-4rcpq (v1:metadata.name)
POD_NAMESPACE: haproxy (v1:metadata.namespace)
POD_IP: (v1:status.podIP)
Mounts:
/run from tmp (rw,path="run")
/tmp from tmp (rw,path="tmp")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-csmzg (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
tmp:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: 64Mi
kube-api-access-csmzg:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: eks.amazonaws.com/nodegroup=vwgoa-4vcpu-16gb-onDemand
Tolerations: OnDemand=true:PreferNoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 32s default-scheduler Successfully assigned haproxy/haproxy-kubernetes-ingress-7d8448d8b5-4rcpq to ip-10-11-39-28.ec2.internal
Normal Pulled 6s (x3 over 31s) kubelet Container image "haproxytech/kubernetes-ingress:1.11.4" already present on machine
Normal Created 6s (x3 over 31s) kubelet Created container kubernetes-ingress-controller
Normal Started 6s (x3 over 31s) kubelet Started container kubernetes-ingress-controller
Warning Unhealthy 5s (x2 over 30s) kubelet Startup probe failed: Get "http://10.11.57.131:1042/healthz": dial tcp 10.11.57.131:1042: connect: connection refused
Warning BackOff 1s (x8 over 29s) kubelet Back-off restarting failed container kubernetes-ingress-controller in pod haproxy-kubernetes-ingress-7d8448d8b5-4rcpq_haproxy(be15a544-1d21-4ed5-9207-631133bbac46)
and pod logs...
s6-overlay-suexec: warning: unable to gain root privileges (is the suid bit set?)
/package/admin/s6-overlay/libexec/preinit: info: read-only root
/package/admin/s6-overlay/libexec/preinit: info: writable /run. Checking for executability.
If it's helpful, I also created a new node with taints that only allow the HAProxy workloads to run on it. I updated my HAProxy deployment with the appropriate Node Selectors and Tolerations to get the pods onto that isolated node. Attached are logs from the kubelet and containerd of that node: haproxy-containerd.log haproxy-kubelet.log
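Roughly how I pinned the pods to the isolated node (a sketch: the taint key/value are placeholders I chose, and controller.nodeSelector / controller.tolerations should be checked against the chart's values.yaml):

```yaml
# Taint applied to the dedicated node beforehand, e.g.:
#   kubectl taint nodes <node-name> dedicated=haproxy:NoSchedule
# Helm values so only the ingress controller pods land on that node:
controller:
  nodeSelector:
    eks.amazonaws.com/nodegroup: vwgoa-4vcpu-16gb-onDemand
  tolerations:
    - key: dedicated        # placeholder taint key
      operator: Equal
      value: haproxy        # placeholder taint value
      effect: NoSchedule
```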
Sorry to bug you @oktalz, but do you have any other ideas? I'm now seeing this problem with all of my HAProxy workloads across multiple EKS clusters.
I'm sort of out of ideas of things to try.
Just some more information, sorry if this is irrelevant.
I noticed that in a working HAProxy pod, these are the running processes:
$ ps aux
PID USER TIME COMMAND
1 haproxy 0:00 /package/admin/s6/command/s6-svscan -d4 -- /run/service
20 haproxy 0:00 s6-supervise s6-linux-init-shutdownd
22 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
29 haproxy 0:00 s6-supervise ingress-controller
30 haproxy 0:00 s6-supervise haproxy
31 haproxy 0:00 s6-supervise s6rc-fdholder
32 haproxy 0:00 s6-supervise s6rc-oneshot-runner
38 haproxy 0:00 /package/admin/s6/command/s6-ipcserverd -1 -- /package/admin/s6/command/s6-ipcserver-access -v0 -E -l0 -i data/rules -- /package/admin/s6/command/s6-sudod -t 30000 -- /package/admin/s6-rc/c
74 haproxy 7:24 /haproxy-ingress-controller --with-s6-overlay --default-ssl-certificate=kube-system/haproxy-dev-kubernetes-ingress-default-cert --configmap=kube-system/haproxy-dev-kubernetes-ingress --http
130 haproxy 0:00 /usr/local/sbin/haproxy -W -db -m 10534 -S /var/run/haproxy-master.sock,level,admin -f /etc/haproxy/haproxy.cfg -f /etc/haproxy/haproxy-aux.cfg
150 haproxy 1h27 /usr/local/sbin/haproxy -sf 137 -x sockpair@4 -W -db -m 10534 -S /var/run/haproxy-master.sock,level,admin -f /etc/haproxy/haproxy.cfg -f /etc/haproxy/haproxy-aux.cfg
2197 haproxy 0:00 /bin/sh
2204 haproxy 0:00 ps aux
But in one of my broken HAProxy pods, the running processes are these:
ps aux
PID USER TIME COMMAND
1 haproxy 0:00 /package/admin/s6/command/s6-svscan -d4 -- /run/service
21 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
36 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
40 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
52 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
70 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
92 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
160 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
172 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
190 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
310 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
320 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
340 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
429 haproxy 0:00 /bin/sh
445 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
486 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
526 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
530 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
546 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
648 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
708 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
778 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
804 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
848 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
860 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
973 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
981 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1005 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1015 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1077 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1143 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1155 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1159 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1175 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1195 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1227 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1267 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1271 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1335 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1459 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1517 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1553 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1631 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1633 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1651 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1653 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1667 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1679 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1707 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1723 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1727 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1787 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1847 haproxy 0:00 /package/admin/s6-linux-init/command/s6-linux-init-shutdownd -d3 -c /run/s6/basedir -g 3000 -C -B
1929 haproxy 0:00 ps aux
And the same pod logs:
s6-overlay-suexec: warning: unable to gain root privileges (is the suid bit set?)
/package/admin/s6-overlay/libexec/preinit: info: read-only root
/package/admin/s6-overlay/libexec/preinit: info: writable /run. Checking for executability.
I'm experiencing a very strange behavior where sometimes an HAProxy Kubernetes Ingress pod fails to start and begins to crash loop. The initial kubectl describe output seems to point to an issue with the startup probe failing. The issue seems to be sporadic and requires a "fresh" new node that has never had a HAProxy Ingress Controller pod on it before to resolve the issue.
(Initial describe output)
But when I adjust the startup probe configuration to allow for more startup time, the pods still continue to crash immediately but with a 137 exit code and s6-overlay error logs.
(Altered startup probe configuration)
(kubectl describe output from after probe update)
(container logs)
I've tried different versions of the HAProxy Ingress Controller, updating Kubernetes versions, updating node AMIs, altering resource allocations (trying to address the 137 exit code), removing security contexts, and more, with no luck. Oddly, I'm only having this issue on one of the EKS clusters I'm running. The exact same installation works on a different EKS cluster running the same version and configuration.
Specs
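For reference, the probe change I tried was along these lines (a sketch only: it assumes the chart version in use exposes a controller.startupProbe override, and the numbers are ones I picked to allow roughly a minute of startup time — verify the key names against the chart's values.yaml):

```yaml
# Relax the startup probe via Helm values (hypothetical values; the NOTES
# output above shows the default as 20 attempts at a 1s period).
controller:
  startupProbe:
    failureThreshold: 60   # allow ~60s of startup instead of ~20s
    periodSeconds: 1
    timeoutSeconds: 1
    successThreshold: 1
```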