jcmoraisjr / haproxy-ingress

HAProxy Ingress
https://haproxy-ingress.github.io
Apache License 2.0

HAProxy fails to start #978

Closed · bootc closed this issue 1 year ago

bootc commented 1 year ago

Description of the problem

With HAProxy Ingress 0.14.0 and the external HAProxy configuration, the HAProxy container fails to start in my configuration.

[NOTICE]   (1) : haproxy version is 2.4.20-d59cd78
[ALERT]    (1) : parsing [/etc/haproxy/haproxy.cfg:137] : 'bind unix@/var/run/haproxy/_https_socket.sock' : 'crt-list' : cannot open the file '/var/lib/haproxy/crt/default-fake-certificate.pem'.
[ALERT]    (1) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
[ALERT]    (1) : Fatal errors found in configuration.

It works fine if I use the combined HAProxy + HAProxy Ingress container.

In case it's relevant, I only have two Ingress resources picked up by HAProxy Ingress and both are configured with ingress.kubernetes.io/ssl-passthrough: "true".

Expected behavior

HAProxy should start when run separately from the HAProxy Ingress container.

Steps to reproduce the problem

  1. Install the chart with controller.haproxy.enabled=true.
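
For reference, a minimal install along those lines, following the chart's quickstart (the repo URL, release name, and namespace are assumptions for this sketch):

helm repo add haproxy-ingress https://haproxy-ingress.github.io/charts
helm install haproxy-ingress haproxy-ingress/haproxy-ingress \
  --namespace ingress-controller --create-namespace \
  --set controller.haproxy.enabled=true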

Environment information

HAProxy Ingress version: v0.14.0

Helm values.yaml:

controller:
  extraArgs:
    ignore-ingress-without-class: "true"
    wait-before-shutdown: "30"

  config:
    bind-ip-addr-http: "[::]"
    bind-ip-addr-tcp: "[::]"
    https-log-format: default
    tcp-log-format: default

  replicaCount: 2

  resources:
    requests:
      cpu: 10m
      memory: 64Mi
    limits:
      memory: 192Mi

  service:
    type: LoadBalancer
    externalTrafficPolicy: Local
    ipFamilyPolicy: RequireDualStack

  haproxy:
    enabled: true

    resources:
      requests:
        cpu: 10m
        memory: 64Mi
      limits:
        memory: 96Mi

  stats:
    enabled: true

  metrics:
    enabled: true

  serviceMonitor:
    enabled: true

Ingress objects:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    ingress.kubernetes.io/ssl-passthrough: "true"
  name: test
spec:
  ingressClassName: haproxy
  rules:
  - host: test.example.com
    http:
      paths:
      - backend:
          service:
            name: test
            port:
              name: https
        path: /
        pathType: Prefix
jcmoraisjr commented 1 year ago

Hi, sorry about the long delay. I couldn't reproduce the problem. Can you confirm whether this issue is reproducible on older controller versions? Which distro are you using? Can you also confirm that both the controller and haproxy containers mount /var/lib/haproxy, and whether files written by the controller side can be read by the haproxy side?

joelwurtz commented 1 year ago

We got the same error when trying to use an external haproxy. We don't know about previous versions, but in our case all of the mounted directories were empty.

Is the expected behavior to have shared mount directories between the containers in this case?

If that's the case, it would explain the problem, as our provider does not support sharing the same volume between containers.

jcmoraisjr commented 1 year ago

The file system is the controller's only way to communicate with a haproxy instance. Running haproxy externally means that you need to mirror the fs between the containers. There are a few words about that in the example page. Double check how your deployment resource is created regarding the emptyDir configuration, or, even better, try to deploy two containers sharing a fs via emptyDir in order to reproduce the issue, as sketched below. Let us know if our helm chart is creating something in a wrong way.
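
A minimal sketch of such a test pod, assuming a throwaway busybox image (image, file names and paths are illustrative, not the chart's output):

apiVersion: v1
kind: Pod
metadata:
  name: shared-fs-test
spec:
  containers:
  - name: writer
    image: busybox
    # writes a file into the shared emptyDir, mimicking the controller side
    command: ["sh", "-c", "echo test > /var/lib/haproxy/probe && sleep 3600"]
    volumeMounts:
    - name: lib
      mountPath: /var/lib/haproxy
  - name: reader
    image: busybox
    # should be able to read /var/lib/haproxy/probe written by the other container
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: lib
      mountPath: /var/lib/haproxy
  volumes:
  - name: lib
    emptyDir: {}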

joelwurtz commented 1 year ago

Here are some details about the error:

Pod description:

Name:             main-haproxy-ingress-6b8cc8db4-j4bjq
Namespace:        router
Priority:         0
Service Account:  main-haproxy-ingress
Node:             preprod-gra-node-pool-node-b18ebe/10.0.1.159
Start Time:       Wed, 08 Feb 2023 11:04:59 +0100
Labels:           app.kubernetes.io/instance=main
                  app.kubernetes.io/name=haproxy-ingress
                  pod-template-hash=6b8cc8db4
Annotations:      cni.projectcalico.org/containerID: 1a11c22cea893a9fe5f422221f1151434e40de9ed27a20fd5cd5c43821aa0c70
                  cni.projectcalico.org/podIP: 10.2.0.225/32
                  cni.projectcalico.org/podIPs: 10.2.0.225/32
Status:           Running
IP:               10.2.0.225
IPs:
  IP:           10.2.0.225
Controlled By:  ReplicaSet/main-haproxy-ingress-6b8cc8db4
Init Containers:
  haproxy-ingress-init:
    Container ID:  containerd://7fdc74fca37d8f786b21f4e4beacd3b0183fd0d5c82c1592d8fe7c241d203953
    Image:         quay.io/jcmoraisjr/haproxy-ingress:v0.14.0
    Image ID:      quay.io/jcmoraisjr/haproxy-ingress@sha256:2afa7354e39b0952cea74727cb45350bb3c1ec2acd335062b54482677cda16ee
    Port:          <none>
    Host Port:     <none>
    Args:
      --init
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 08 Feb 2023 11:05:01 +0100
      Finished:     Wed, 08 Feb 2023 11:05:02 +0100
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     10m
      memory:  32Mi
    Requests:
      cpu:        10m
      memory:     32Mi
    Environment:  <none>
    Mounts:
      /etc/haproxy from etc (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-v7lwr (ro)
Containers:
  haproxy-ingress:
    Container ID:  containerd://f2c410baf50494cae1e29d135f620ddb1963bad39bf7cc1ae220ae0d08a04b3b
    Image:         quay.io/jcmoraisjr/haproxy-ingress:v0.14.0
    Image ID:      quay.io/jcmoraisjr/haproxy-ingress@sha256:2afa7354e39b0952cea74727cb45350bb3c1ec2acd335062b54482677cda16ee
    Port:          10254/TCP
    Host Port:     0/TCP
    Args:
      --configmap=router/main-haproxy-ingress
      --ingress-class=haproxy
      --master-socket=/var/run/haproxy/master.sock
      --sort-backends
    State:          Running
      Started:      Wed, 08 Feb 2023 11:05:03 +0100
    Ready:          True
    Restart Count:  0
    Environment:
      POD_NAME:       main-haproxy-ingress-6b8cc8db4-j4bjq (v1:metadata.name)
      POD_NAMESPACE:  router (v1:metadata.namespace)
    Mounts:
      /etc/haproxy from etc (rw)
      /var/lib/haproxy from lib (rw)
      /var/run/haproxy from run (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-v7lwr (ro)
  haproxy:
    Container ID:  containerd://d6e8bfab4c12224ab8b9efd2742e07c099bcc1404b7b1621eeb313ee0a28564b
    Image:         haproxy:2.4.20-alpine
    Image ID:      docker.io/library/haproxy@sha256:f169c8d3078c75ab08bd7836632f45042b806f3fedf87b800bf790cb38eba9f5
    Ports:         80/TCP, 443/TCP, 9101/TCP, 1936/TCP, 10253/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    Args:
      -W
      -S
      /var/run/haproxy/master.sock,mode,600
      -f
      /etc/haproxy
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 08 Feb 2023 11:05:46 +0100
      Finished:     Wed, 08 Feb 2023 11:05:46 +0100
    Ready:          False
    Restart Count:  3
    Liveness:       http-get http://:10253/healthz delay=60s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:10253/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /etc/haproxy from etc (rw)
      /var/lib/haproxy from lib (rw)
      /var/run/haproxy from run (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-v7lwr (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  etc:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  lib:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  run:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-v7lwr:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  76s                default-scheduler  Successfully assigned router/main-haproxy-ingress-6b8cc8db4-j4bjq to preprod-gra-node-pool-node-b18ebe
  Normal   Pulling    75s                kubelet            Pulling image "quay.io/jcmoraisjr/haproxy-ingress:v0.14.0"
  Normal   Pulled     74s                kubelet            Successfully pulled image "quay.io/jcmoraisjr/haproxy-ingress:v0.14.0" in 498.060831ms
  Normal   Created    74s                kubelet            Created container haproxy-ingress-init
  Normal   Started    74s                kubelet            Started container haproxy-ingress-init
  Normal   Pulling    73s                kubelet            Pulling image "quay.io/jcmoraisjr/haproxy-ingress:v0.14.0"
  Normal   Pulled     73s                kubelet            Successfully pulled image "quay.io/jcmoraisjr/haproxy-ingress:v0.14.0" in 511.60126ms
  Normal   Created    73s                kubelet            Created container haproxy-ingress
  Normal   Started    72s                kubelet            Started container haproxy-ingress
  Normal   Pulled     71s                kubelet            Successfully pulled image "haproxy:2.4.20-alpine" in 924.066129ms
  Normal   Pulled     70s                kubelet            Successfully pulled image "haproxy:2.4.20-alpine" in 1.037779427s
  Normal   Pulling    54s (x3 over 72s)  kubelet            Pulling image "haproxy:2.4.20-alpine"
  Normal   Created    53s (x3 over 71s)  kubelet            Created container haproxy
  Normal   Started    53s (x3 over 71s)  kubelet            Started container haproxy
  Normal   Pulled     53s                kubelet            Successfully pulled image "haproxy:2.4.20-alpine" in 651.671222ms
  Warning  BackOff    52s (x5 over 69s)  kubelet            Back-off restarting failed container

Logs of the failing pod :

➜  kube git:(master) ✗ kubectl logs main-haproxy-ingress-6b8cc8db4-j4bjq -c haproxy -n router
[NOTICE]   (1) : haproxy version is 2.4.20-d59cd78
[ALERT]    (1) : parsing [/etc/haproxy/haproxy.cfg:325] : 'bind unix@/var/run/haproxy/_https_socket.sock' : 'crt-list' : cannot open the file '/var/lib/haproxy/crt/default-fake-certificate.pem'.
[ALERT]    (1) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
[ALERT]    (1) : Fatal errors found in configuration.

The volumes seem to be correctly mapped when I inspect the haproxy-ingress container:

➜  kube git:(master) ✗ kubectl exec main-haproxy-ingress-6b8cc8db4-j4bjq  -i -t -n router -- sh
Defaulted container "haproxy-ingress" out of: haproxy-ingress, haproxy, haproxy-ingress-init (init)
/ # ls -l /var/lib/haproxy/c
cacerts/  crl/      crt/
/ # ls -l /var/lib/haproxy/crt
total 20
-rw-------    1 root     root          7360 Feb  8 10:05 agent-proxy_router-domain-tls.pem
-rw-------    1 root     root          2933 Feb  8 10:05 default-fake-certificate.pem
-rw-------    1 root     root          7360 Feb  8 10:05 demo_router-domain-tls.pem

It's really strange. Should the lib mount point also be present in the init container?

jcmoraisjr commented 1 year ago

What about the content of /var/lib/haproxy/crt/ on both haproxy-ingress and haproxy containers?

It's really strange. Should the lib mount point also be present in the init container?

The init container only needs /etc/haproxy/ mounted.
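
Something like the following should show the directory from each side (pod and namespace taken from the description above; the crashing haproxy container has to be kept alive, e.g. with a sleep command, for the second exec to work):

kubectl exec main-haproxy-ingress-6b8cc8db4-j4bjq -n router -c haproxy-ingress -- ls -l /var/lib/haproxy/crt
kubectl exec main-haproxy-ingress-6b8cc8db4-j4bjq -n router -c haproxy -- ls -l /var/lib/haproxy/crt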

joelwurtz commented 1 year ago

Okay, it's a permission problem; the haproxy error doesn't give this information. After updating the container to keep it alive and exec'ing into it:

~/crt $ ls -la
total 28
drwxr-xr-x    2 root     root          4096 Feb  8 17:07 .
drwxrwxrwx    6 root     root          4096 Feb  8 17:08 ..
-rw-------    1 root     root          7360 Feb  8 17:07 agent-proxy_router-domain-tls.pem
-rw-------    1 root     root          2933 Feb  8 17:07 default-fake-certificate.pem
-rw-------    1 root     root          7360 Feb  8 17:07 demo_router-domain-tls.pem
~/crt $ cat default-fake-certificate.pem
cat: can't open 'default-fake-certificate.pem': Permission denied
~/crt $ whoami
haproxy

The files are created by the root user, but the container runs as the haproxy user.
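
A quick way to confirm the mismatch from outside the pod (a sketch using the pod from the earlier description; the haproxy container must be kept alive as above, and the expected output is shown as comments):

kubectl exec main-haproxy-ingress-6b8cc8db4-j4bjq -n router -c haproxy-ingress -- id
# expected: uid=0(root) ...
kubectl exec main-haproxy-ingress-6b8cc8db4-j4bjq -n router -c haproxy -- id
# expected: uid=99(haproxy) ...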

jcmoraisjr commented 1 year ago

Ah sure, that makes sense. As a short-term solution you can either run haproxy as root (see the security considerations) or start the controller with haproxy's uid.

Changing haproxy to root:

controller:
  haproxy:
    securityContext:
      runAsUser: 0

Changing controller to haproxy's uid:

controller:
  securityContext:
    runAsUser: 99

I'll leave this issue open until we have a good long-term solution, maybe adding a param to choose between syncing uids or making the permissions world-readable.
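
Either snippet can then be applied with a regular chart upgrade, e.g. (release name and values file are placeholders):

helm upgrade <release> haproxy-ingress/haproxy-ingress -f values.yaml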

joelwurtz commented 1 year ago

I tried running the ingress controller as the haproxy user (99), but the external haproxy then failed to bind to ports 80 and 443 (as expected), so running it as root is better.

➜  kube git:(master) ✗ kubectl logs main-haproxy-ingress-c6468758b-khdl5 -c haproxy -n router
[NOTICE]   (1) : haproxy version is 2.4.20-d59cd78
[ALERT]    (1) : Starting proxy _front__tls: cannot bind socket (Permission denied) [0.0.0.0:443]
[ALERT]    (1) : Starting frontend _front_http: cannot bind socket (Permission denied) [0.0.0.0:80]
[ALERT]    (1) : [haproxy.main()] Some protocols failed to start their listeners! Exiting.

AFAIK we should run the haproxy container as root but use:

user haproxy
group haproxy

in the haproxy.cfg file, so that it forks workers as this user after binding the network sockets.

(This didn't seem to be the case when I checked the current configuration.)

Running the external haproxy as root seems to work fine ATM, but it's definitely not a best practice in terms of security.
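
For reference, those directives would sit in the global section of haproxy.cfg, roughly like this (a sketch, not the controller's generated config):

global
    user haproxy
    group haproxy
    # remaining global settings generated by the controller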

joelwurtz commented 1 year ago

This is what I ended up doing in order to have a "correct" setup:

controller:
  securityContext:
    runAsUser: 99

  haproxy:
    enabled: true

    securityContext:
      runAsUser: 0
      runAsGroup: 0

  config:
    use-haproxy-user: "true"

So the certificate files are readable by the haproxy user, the external haproxy is launched as root but runs as the haproxy user, and it seems to work fine.
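
To double check, the process list inside the haproxy container should show the master process running as root and the worker processes running as haproxy (pod name below is a placeholder):

kubectl exec <pod> -n router -c haproxy -- ps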

jcorley-sysdig commented 1 year ago

One solution to the port 80/443 problem that we've tested is adding this to the container:

setcap CAP_NET_BIND_SERVICE=+eip /usr/local/sbin/haproxy
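
One way to bake that into a derived image (a sketch; the Alpine package that provides the setcap tool may vary by release):

FROM haproxy:2.4.20-alpine
USER root
# install setcap and grant the bind capability to the haproxy binary
RUN apk add --no-cache libcap \
 && setcap cap_net_bind_service=+eip /usr/local/sbin/haproxy
# drop back to the image's unprivileged user
USER haproxy
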
jcmoraisjr commented 1 year ago

Fixed in v0.14.3, v0.13.12 and v0.12.17, closing. Let us know if you still find any permission related problem.