kube-hetzner / terraform-hcloud-kube-hetzner

Optimized and Maintenance-free Kubernetes on Hetzner Cloud in one command!
MIT License

[Bug]: Traefik keeps restarting all the time #1500

Closed abriginets closed 2 weeks ago

abriginets commented 1 month ago

Description

I recently created a new cluster and deployed a couple of web services on it, with TLS configured for each of them. Finally, I put the Hetzner LB behind Cloudflare, and that's when I started getting 525 SSL Handshake Error responses from Cloudflare. The investigation led me to the Traefik pod logs, which show that Traefik is being restarted quite a lot, and I can't seem to understand the true reason behind it.

kubectl describe -n traefik pod traefik-75fd586479-vsgvx

  Type     Reason     Age                   From     Message
  ----     ------     ----                  ----     -------
  Warning  Unhealthy  58m (x69 over 9h)     kubelet  Readiness probe failed: Get "http://10.42.3.14:9000/ping": dial tcp 10.42.3.14:9000: connect: connection refused
  Warning  Unhealthy  3m10s (x114 over 9h)  kubelet  Liveness probe failed: Get "http://10.42.3.14:9000/ping": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy  3m8s (x301 over 9h)   kubelet  Readiness probe failed: Get "http://10.42.3.14:9000/ping": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

kubectl get pods -n traefik -o wide

NAME                       READY   STATUS    RESTARTS         AGE     IP              NODE                          NOMINATED NODE   READINESS GATES
traefik-75fd586479-cbrkt   1/1     Running   0                3m29s   10.42.102.249   k3s-worker-17440953fcddcf01   <none>           <none>
traefik-75fd586479-k5b5j   1/1     Running   0                3m29s   10.42.102.250   k3s-worker-17440953fcddcf01   <none>           <none>
traefik-75fd586479-mf4wq   1/1     Running   0                3m29s   10.42.131.7     k3s-workload-amd64-xbl        <none>           <none>
traefik-75fd586479-vsgvx   1/1     Running   37 (2m33s ago)   10h     10.42.3.14      k3s-worker-71de43e0505e4dc5   <none>           <none>

Then I ran the same command again a couple of minutes later:

NAME                       READY   STATUS    RESTARTS         AGE   IP           NODE                          NOMINATED NODE   READINESS GATES
traefik-75fd586479-vsgvx   1/1     Running   37 (5m39s ago)   10h   10.42.3.14   k3s-worker-71de43e0505e4dc5   <none>           <none>

For some reason, traefik jumps from node to node.

Then I looked into the Traefik logs, and here is what they show:

kubectl logs -n traefik traefik-75fd586479-vsgvx --previous

Traefik logs

```
kubectl logs -n traefik traefik-75fd586479-vsgvx --previous
2024-10-11T16:30:57Z INF Traefik version 3.1.5 built on 2024-10-02T12:49:07Z version=3.1.5
2024-10-11T16:30:57Z INF Stats collection is disabled. Help us improve Traefik by turning this feature on :) More details on: https://doc.traefik.io/traefik/contributing/data-collection/
2024-10-11T16:30:57Z INF Enabling ProxyProtocol for trusted IPs [127.0.0.1/32 10.0.0.0/8] entryPointName=web
2024-10-11T16:30:57Z INF Enabling ProxyProtocol for trusted IPs [127.0.0.1/32 10.0.0.0/8] entryPointName=websecure
2024-10-11T16:30:57Z INF Starting provider aggregator aggregator.ProviderAggregator
2024-10-11T16:30:57Z INF Starting provider *traefik.Provider
2024-10-11T16:30:57Z INF Starting provider *ingress.Provider
2024-10-11T16:30:57Z INF ingress label selector is: "" providerName=kubernetes
2024-10-11T16:30:57Z INF Creating in-cluster Provider client providerName=kubernetes
2024-10-11T16:30:57Z INF Starting provider *crd.Provider
2024-10-11T16:30:57Z INF label selector is: "" providerName=kubernetescrd
2024-10-11T16:30:57Z INF Creating in-cluster Provider client providerName=kubernetescrd
2024-10-11T16:30:57Z INF Starting provider *acme.ChallengeTLSALPN
2024-10-11T16:31:09Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:41942: read: connection reset by peer"
2024-10-11T16:31:13Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:42482: read: connection reset by peer"
2024-10-11T16:31:17Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:2956: read: connection reset by peer"
2024-10-11T16:31:20Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:35264: read: connection reset by peer"
2024-10-11T16:31:22Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:6725: read: connection reset by peer"
2024-10-11T16:31:26Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:11449: read: connection reset by peer"
2024-10-11T16:31:28Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:60139: read: connection reset by peer"
2024-10-11T16:31:30Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:26146: read: connection reset by peer"
2024-10-11T16:31:48Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:64538: read: connection reset by peer"
2024-10-11T16:31:50Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:62160: read: connection reset by peer"
2024-10-11T16:31:54Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:59993: read: connection reset by peer"
2024-10-11T16:31:58Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:22806: read: connection reset by peer"
2024-10-11T16:32:03Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:19873: read: connection reset by peer"
2024-10-11T16:32:17Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:40547: read: connection reset by peer"
2024-10-11T16:32:43Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:38871: read: connection reset by peer"
2024-10-11T16:32:50Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:45781: read: connection reset by peer"
2024-10-11T16:32:52Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:12022: read: connection reset by peer"
2024-10-11T16:32:54Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:17638: read: connection reset by peer"
2024-10-11T16:33:07Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:1189: read: connection reset by peer"
2024-10-11T16:33:24Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:43642: read: connection reset by peer"
2024-10-11T16:33:39Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:54101: read: connection reset by peer"
2024-10-11T16:33:41Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:60014: read: connection reset by peer"
2024-10-11T16:33:47Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:21735: read: connection reset by peer"
2024-10-11T16:34:18Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:4936: read: connection reset by peer"
2024-10-11T16:34:35Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:17052: read: connection reset by peer"
2024-10-11T16:34:47Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:9700: read: connection reset by peer"
2024-10-11T16:34:54Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:43955: read: connection reset by peer"
2024-10-11T16:35:11Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:56010: read: connection reset by peer"
2024-10-11T16:35:17Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:46808: read: connection reset by peer"
2024-10-11T16:35:24Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:56341: read: connection reset by peer"
2024-10-11T16:35:28Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:37627: read: connection reset by peer"
2024-10-11T16:35:35Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:23782: read: connection reset by peer"
2024-10-11T16:35:45Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:58842: read: connection reset by peer"
2024-10-11T16:35:52Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:29932: read: connection reset by peer"
2024-10-11T16:35:56Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:21478: read: connection reset by peer"
2024-10-11T16:36:00Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:18906: read: connection reset by peer"
2024-10-11T16:36:04Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:56715: read: connection reset by peer"
2024-10-11T16:36:05Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:41407: read: connection reset by peer"
2024-10-11T16:36:11Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:2593: read: connection reset by peer"
2024-10-11T16:36:13Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:48587: read: connection reset by peer"
2024-10-11T16:36:19Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:12493: read: connection reset by peer"
2024-10-11T16:36:20Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:55890: read: connection reset by peer"
2024-10-11T16:36:22Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:4663: read: connection reset by peer"
2024-10-11T16:36:24Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:59695: read: connection reset by peer"
2024-10-11T16:36:26Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:23891: read: connection reset by peer"
2024-10-11T16:36:28Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:51389: read: connection reset by peer"
2024-10-11T16:36:30Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:4682: read: connection reset by peer"
2024-10-11T16:36:32Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:54670: read: connection reset by peer"
2024-10-11T16:36:34Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:49066: read: connection reset by peer"
2024-10-11T16:36:35Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:31648: read: connection reset by peer"
2024-10-11T16:36:37Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:42948: read: connection reset by peer"
2024-10-11T16:36:39Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:3393: read: connection reset by peer"
2024-10-11T16:36:41Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:29542: read: connection reset by peer"
2024-10-11T16:36:43Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:47006: read: connection reset by peer"
2024-10-11T16:36:45Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:59415: read: connection reset by peer"
2024-10-11T16:36:47Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:46032: read: connection reset by peer"
2024-10-11T16:36:49Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:10751: read: connection reset by peer"
2024-10-11T16:36:50Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:8004: read: connection reset by peer"
2024-10-11T16:36:52Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:1877: read: connection reset by peer"
2024-10-11T16:36:54Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:57597: read: connection reset by peer"
2024-10-11T16:36:56Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:17810: read: connection reset by peer"
2024-10-11T16:36:58Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:60498: read: connection reset by peer"
2024-10-11T16:37:00Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:49148: read: connection reset by peer"
2024-10-11T16:37:02Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:30491: read: connection reset by peer"
2024-10-11T16:37:04Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:9393: read: connection reset by peer"
2024-10-11T16:37:05Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:13296: read: connection reset by peer"
2024-10-11T16:37:07Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:53422: read: connection reset by peer"
2024-10-11T16:37:09Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:37960: read: connection reset by peer"
2024-10-11T16:37:11Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:17739: read: connection reset by peer"
2024-10-11T16:37:13Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:11345: read: connection reset by peer"
2024-10-11T16:37:15Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:30884: read: connection reset by peer"
2024-10-11T16:37:17Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:43100: read: connection reset by peer"
2024-10-11T16:37:19Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:57897: read: connection reset by peer"
2024-10-11T16:37:20Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:17745: read: connection reset by peer"
2024-10-11T16:37:22Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:1079: read: connection reset by peer"
2024-10-11T16:37:24Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:11605: read: connection reset by peer"
2024-10-11T16:37:26Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:22391: read: connection reset by peer"
2024-10-11T16:37:28Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:30293: read: connection reset by peer"
2024-10-11T16:37:30Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:13250: read: connection reset by peer"
2024-10-11T16:37:32Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:7965: read: connection reset by peer"
2024-10-11T16:37:34Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:54181: read: connection reset by peer"
2024-10-11T16:37:35Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:16583: read: connection reset by peer"
2024-10-11T16:37:37Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:53389: read: connection reset by peer"
2024-10-11T16:37:39Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:30789: read: connection reset by peer"
2024-10-11T16:37:41Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:54004: read: connection reset by peer"
2024-10-11T16:37:43Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:33763: read: connection reset by peer"
2024-10-11T16:37:45Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:40997: read: connection reset by peer"
2024-10-11T16:37:47Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:64525: read: connection reset by peer"
2024-10-11T16:37:49Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:36071: read: connection reset by peer"
2024-10-11T16:37:50Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:12668: read: connection reset by peer"
2024-10-11T16:37:52Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:10462: read: connection reset by peer"
2024-10-11T16:37:54Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:18195: read: connection reset by peer"
2024-10-11T16:37:56Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:29207: read: connection reset by peer"
2024-10-11T16:37:58Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:19648: read: connection reset by peer"
2024-10-11T16:38:00Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:54209: read: connection reset by peer"
2024-10-11T16:38:02Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:29710: read: connection reset by peer"
2024-10-11T16:38:04Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:51228: read: connection reset by peer"
2024-10-11T16:38:05Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:44994: read: connection reset by peer"
2024-10-11T16:38:07Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:51831: read: connection reset by peer"
2024-10-11T16:38:09Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:54868: read: connection reset by peer"
2024-10-11T16:38:11Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:15643: read: connection reset by peer"
2024-10-11T16:38:13Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:4027: read: connection reset by peer"
2024-10-11T16:38:15Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:44883: read: connection reset by peer"
2024-10-11T16:38:17Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:64669: read: connection reset by peer"
2024-10-11T16:38:19Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:46981: read: connection reset by peer"
2024-10-11T16:38:20Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:44315: read: connection reset by peer"
2024-10-11T16:38:22Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:9132: read: connection reset by peer"
2024-10-11T16:38:24Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:31548: read: connection reset by peer"
2024-10-11T16:38:26Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:37932: read: connection reset by peer"
2024-10-11T16:38:28Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:10387: read: connection reset by peer"
2024-10-11T16:38:30Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:39554: read: connection reset by peer"
2024-10-11T16:38:32Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:14257: read: connection reset by peer"
2024-10-11T16:38:34Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:40483: read: connection reset by peer"
2024-10-11T16:38:35Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:30745: read: connection reset by peer"
2024-10-11T16:38:37Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:41608: read: connection reset by peer"
2024-10-11T16:38:39Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:64420: read: connection reset by peer"
2024-10-11T16:38:41Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:32593: read: connection reset by peer"
2024-10-11T16:38:43Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:53297: read: connection reset by peer"
2024-10-11T16:38:45Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:51223: read: connection reset by peer"
2024-10-11T16:38:47Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:1576: read: connection reset by peer"
2024-10-11T16:38:49Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:23271: read: connection reset by peer"
2024-10-11T16:38:50Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:18630: read: connection reset by peer"
2024-10-11T16:38:52Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:58694: read: connection reset by peer"
2024-10-11T16:38:54Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:35423: read: connection reset by peer"
2024-10-11T16:38:56Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:59727: read: connection reset by peer"
2024-10-11T16:38:58Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:55791: read: connection reset by peer"
2024-10-11T16:39:00Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:39101: read: connection reset by peer"
2024-10-11T16:39:02Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:23349: read: connection reset by peer"
2024-10-11T16:39:04Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:56860: read: connection reset by peer"
2024-10-11T16:39:05Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:59433: read: connection reset by peer"
2024-10-11T16:39:07Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:54419: read: connection reset by peer"
2024-10-11T16:39:09Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:3194: read: connection reset by peer"
2024-10-11T16:39:11Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:47675: read: connection reset by peer"
2024-10-11T16:39:13Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:21305: read: connection reset by peer"
2024-10-11T16:39:15Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:46435: read: connection reset by peer"
2024-10-11T16:39:17Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.130.0:26238: read: connection reset by peer"
2024-10-11T16:39:19Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:65219: read: connection reset by peer"
2024-10-11T16:39:20Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:63080: read: connection reset by peer"
2024-10-11T16:39:22Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:21911: read: connection reset by peer"
2024-10-11T16:39:24Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.130.0:14581: read: connection reset by peer"
2024-10-11T16:39:26Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:16629: read: connection reset by peer"
2024-10-11T16:39:28Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:47622: read: connection reset by peer"
2024-10-11T16:39:30Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:16943: read: connection reset by peer"
2024-10-11T16:39:34Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:18603: read: connection reset by peer"
2024-10-11T16:39:35Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:24521: read: connection reset by peer"
2024-10-11T16:39:37Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:33371: read: connection reset by peer"
2024-10-11T16:39:38Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:14758: read: connection reset by peer"
2024-10-11T16:39:40Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:37225: read: connection reset by peer"
2024-10-11T16:39:43Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:19001: read: connection reset by peer"
2024-10-11T16:39:45Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:18821: read: connection reset by peer"
2024-10-11T16:39:48Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:25361: read: connection reset by peer"
2024-10-11T16:39:50Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:48834: read: connection reset by peer"
2024-10-11T16:39:53Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:17658: read: connection reset by peer"
2024-10-11T16:39:55Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:28881: read: connection reset by peer"
2024-10-11T16:39:58Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:31312: read: connection reset by peer"
2024-10-11T16:40:00Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:21971: read: connection reset by peer"
2024-10-11T16:40:03Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:44184: read: connection reset by peer"
2024-10-11T16:40:05Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:15407: read: connection reset by peer"
2024-10-11T16:40:08Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.131.0:52434: read: connection reset by peer"
2024-10-11T16:40:10Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.3.1:25737: read: connection reset by peer"
2024-10-11T16:40:13Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8000->10.42.102.0:29679: read: connection reset by peer"
2024-10-11T16:40:14Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:46834: read: connection reset by peer"
2024-10-11T16:40:16Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:8303: read: connection reset by peer"
2024-10-11T16:40:19Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:46922: read: connection reset by peer"
2024-10-11T16:40:21Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:63431: read: connection reset by peer"
2024-10-11T16:40:24Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:36068: read: connection reset by peer"
2024-10-11T16:40:26Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:55451: read: connection reset by peer"
2024-10-11T16:40:29Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:57147: read: connection reset by peer"
2024-10-11T16:40:31Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:37188: read: connection reset by peer"
2024-10-11T16:40:34Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:32515: read: connection reset by peer"
2024-10-11T16:40:36Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:48865: read: connection reset by peer"
2024-10-11T16:40:39Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:26165: read: connection reset by peer"
2024-10-11T16:40:41Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:10806: read: connection reset by peer"
2024-10-11T16:40:44Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:49117: read: connection reset by peer"
2024-10-11T16:40:46Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:26946: read: connection reset by peer"
2024-10-11T16:40:49Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.102.0:17212: read: connection reset by peer"
2024-10-11T16:41:04Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.3.1:65230: read: connection reset by peer"
2024-10-11T16:41:17Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:8443->10.42.131.0:18003: read: connection reset by peer"
2024-10-11T16:41:22Z INF I have to go...
2024-10-11T16:41:52Z INF Stopping server gracefully
```

It just shuts down gracefully and that's it. So I kept monitoring it, and sometimes I get additional "use of closed network connection" errors:

2024-10-11T15:57:29Z INF I have to go...
2024-10-11T15:57:35Z INF Stopping server gracefully
2024-10-11T15:57:36Z ERR error="accept tcp [::]:9100: use of closed network connection" entryPointName=metrics
2024-10-11T15:57:42Z ERR error="accept tcp [::]:9000: use of closed network connection" entryPointName=traefik
2024-10-11T15:57:42Z ERR error="accept tcp [::]:8443: use of closed network connection" entryPointName=websecure
2024-10-11T15:57:48Z ERR Error while Peeking first byte error="read tcp 10.42.3.14:9000->10.42.3.1:50102: use of closed network connection"
2024-10-11T15:57:42Z ERR error="accept tcp [::]:8000: use of closed network connection" entryPointName=web
2024-10-11T15:58:16Z INF Shutting down
2024-10-11T15:58:16Z INF Server stopped

I thought this might be happening because I have 0 agent_nodepools nodes, only autoscaled ones, where I had already deployed high-traffic web apps that could easily consume a lot of CPU and RAM. I added a new node to agent_nodepools just to make sure, and the Traefik pod did move onto that spare node, yet it restarted again.


I'm completely out of ideas at this point. Traefik is not crashing, it just keeps restarting. The Traefik docs say that the stop signal comes from above.

Kube.tf file

locals {
  hcloud_token = "xxxxxxxxxxx"
}

module "kube-hetzner" {
  providers = {
    hcloud = hcloud
  }
  hcloud_token = var.hcloud_token != "" ? var.hcloud_token : local.hcloud_token
  source = "kube-hetzner/kube-hetzner/hcloud"
  ssh_public_key = file("~/.ssh/id_ed25519.pub")
  ssh_private_key = file("~/.ssh/id_ed25519")
  network_region = "eu-central" 
  control_plane_nodepools = [
    {
      name        = "control-plane-fsn1",
      server_type = "cax11",
      location    = "fsn1",
      labels      = ["k8s"],
      taints      = [],
      count       = 1,
      swap_size   = "2G"
    },
    {
      name        = "control-plane-nbg1",
      server_type = "cax11",
      location    = "nbg1",
      labels      = ["k8s"],
      taints      = [],
      count       = 1,
      swap_size   = "2G"
    },
    {
      name        = "control-plane-hel1",
      server_type = "cax11",
      location    = "hel1",
      labels      = ["k8s"],
      taints      = [],
      count       = 1,
      swap_size   = "2G"
    }
  ]

  agent_nodepools = [
    {
      name        = "workload-amd64"
      server_type = "cpx11"
      location    = "fsn1"
      labels      = []
      taints      = []
      kubelet_args = [
        "kube-reserved=cpu=50m,memory=300Mi,ephemeral-storage=1Gi",
        "system-reserved=cpu=250m,memory=300Mi"
      ]
      count     = 1
      swap_size = "2G"
    }
  ]
  load_balancer_type     = "lb11"
  load_balancer_location = "fsn1"
  autoscaler_nodepools = [
    {
      name        = "worker"
      server_type = "cpx21"
      location    = "fsn1"
      min_nodes   = 1
      max_nodes   = 10
      taints      = []
      kubelet_args = [
        "kube-reserved=cpu=250m,memory=1000Mi,ephemeral-storage=1Gi",
        "system-reserved=cpu=250m,memory=200Mi"
      ]
    }
  ]
  system_upgrade_use_drain = true
  extra_firewall_rules = [
    {
      description     = "Allow Outbound TCP MySQL Requests"
      direction       = "out"
      protocol        = "tcp"
      port            = "29867"
      source_ips      = []
      destination_ips = ["78.46.253.145"]
    },
    {
      description     = "Allow Outbound UDP MySQL Requests"
      direction       = "out"
      protocol        = "udp"
      port            = "29867"
      source_ips      = []
      destination_ips = ["78.46.253.145"]
    },
    {
      description     = "Allow Outbound TCP Redis Requests"
      direction       = "out"
      protocol        = "tcp"
      port            = "19321"
      source_ips      = []
      destination_ips = ["91.107.211.207"]
    },
    {
      description     = "Allow Outbound UDP Redis Requests"
      direction       = "out"
      protocol        = "udp"
      port            = "19321"
      source_ips      = []
      destination_ips = ["91.107.211.207"]
    }
  ]
  dns_servers = [
    "1.1.1.1",
    "8.8.8.8",
    "2606:4700:4700::1111",
  ]
}

provider "hcloud" {
  token = var.hcloud_token != "" ? var.hcloud_token : local.hcloud_token
}

terraform {
  required_version = ">= 1.5.0"
  required_providers {
    hcloud = {
      source  = "hetznercloud/hcloud"
      version = ">= 1.43.0"
    }
  }
}

output "kubeconfig" {
  value     = module.kube-hetzner.kubeconfig
  sensitive = true
}

variable "hcloud_token" {
  sensitive = true
  default   = ""
}

Screenshots

No response

Platform

Linux

abriginets commented 1 month ago

It appears that setting ingress_replica_count to 3 solves the issue. From the comment above that option:

The default 0 means autoselecting based on number of agent nodes (1 node = 1 replica, 2 nodes = 2 replicas, 3+ nodes = 3 replicas)

I had 0 agent nodes, but 2-4 autoscaled nodes. A single Traefik pod probably couldn't handle the amount of traffic the apps on my cluster were getting, even though kubectl top showed it as nearly idle. I tried using Lens to see the live CPU load, but it couldn't get the metrics.
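For anyone landing here with the same symptoms, this is roughly what the change looks like in kube.tf. It is a minimal sketch, assuming every other argument of the module block stays exactly as posted above; only ingress_replica_count and the value 3 come from this thread.

```hcl
module "kube-hetzner" {
  # ... all other arguments unchanged from the kube.tf posted above ...

  # Pin the Traefik replica count explicitly instead of leaving the default 0,
  # which auto-selects the count from the number of agent nodes.
  ingress_replica_count = 3
}
```

This only pins the replica count. It worked in my case, but I have not confirmed why the single pod kept failing its probes.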

I am going to leave this issue open until it's confirmed to be a misconfiguration on my side and not an actual bug.

mysticaltech commented 2 weeks ago

@abriginets Could have been the case indeed. The pod was not healthy or ready. Happy it's working now!