k3s-io / klipper-lb

Embedded service load balancer in Klipper
Apache License 2.0

[BUG] svclb-traefik* won't start after host crash and restart. #34

Open bayeslearner opened 2 years ago

bayeslearner commented 2 years ago

What did you do

What did you expect to happen

Ingress should work

Screenshots or terminal output

[rockylinux@rockylinux8 infra_k3d]$ kubectl -n kube-system logs svclb-traefik-dkgkq lb-port-80
+ trap exit TERM INT
+ echo 10.43.70.41
+ grep -Eq :
+ cat /proc/sys/net/ipv4/ip_forward
+ '[' 1 '!=' 1 ]
+ iptables -t nat -I PREROUTING '!' -s 10.43.70.41/32 -p TCP --dport 80 -j DNAT --to 10.43.70.41:80
modprobe: can't change directory to '/lib/modules': No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
iptables v1.8.4 (legacy): can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded. 

Which OS & Architecture

Which version of k3d

Which version of docker

Server:
 Containers: 3
  Running: 2
  Paused: 0
  Stopped: 1
 Images: 5
 Server Version: 20.10.13
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc version: v1.0.3-0-gf46b6ba
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.18.0-348.20.1.el8_5.x86_64
 Operating System: Rocky Linux 8.5 (Green Obsidian)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.19GiB
 Name: rockylinux8.linuxvmimages.local
 ID: RI32:V7KA:PDQG:Q2Z2:DNET:CMMP:3MMG:23OF:RMTN:W6J2:WOQO:N4YA
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

bayeslearner commented 2 years ago

Name:           svclb-traefik-wqjjt
Namespace:      kube-system
Priority:       0
Node:           <none>
Labels:         app=svclb-traefik
                controller-revision-hash=f4f897b4f
                pod-template-generation=1
                svccontroller.k3s.cattle.io/svcname=traefik
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  DaemonSet/svclb-traefik
Containers:
  lb-port-80:
    Image:      rancher/klipper-lb:v0.3.4
    Port:       80/TCP
    Host Port:  80/TCP
    Environment:
      SRC_PORT:    80
      DEST_PROTO:  TCP
      DEST_PORT:   80
      DEST_IPS:    10.43.184.59
    Mounts:        <none>
  lb-port-443:
    Image:      rancher/klipper-lb:v0.3.4
    Port:       443/TCP
    Host Port:  443/TCP
    Environment:
      SRC_PORT:    443
      DEST_PROTO:  TCP
      DEST_PORT:   443
      DEST_IPS:    10.43.184.59
    Mounts:        <none>
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:         <none>
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     CriticalAddonsOnly op=Exists
                 node-role.kubernetes.io/control-plane:NoSchedule op=Exists
                 node-role.kubernetes.io/master:NoSchedule op=Exists
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  5m44s  default-scheduler  0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.
  Warning  FailedScheduling  4m32s  default-scheduler  0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.

westrickc commented 2 years ago

I get the same error after creating a new cluster with k3d. My host OS is RHEL 8.5. I think it is related to the fact that RHEL 8.5 only supports the nftables backend for iptables, while the klipper-lb Docker image has iptables symlinked to the legacy binary.
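
To confirm the mismatch, a quick check (assuming the image is Alpine-based and ships /bin/sh; the docker run invocation is only an illustration) is to compare the host backend with the symlink inside the image:

# On the host: the version string ends in "(nf_tables)" or "(legacy)"
iptables --version

# Inside the image: see which backend /sbin/iptables points at
docker run --rm --entrypoint /bin/sh rancher/klipper-lb:v0.3.4 -c 'ls -l /sbin/iptables'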

Relevant versions of things:

My workaround was to rebuild the rancher/klipper-lb:v0.3.4 image with this Dockerfile:

FROM rancher/klipper-lb:v0.3.4
# Point iptables at the nftables backend instead of the legacy one
RUN \
  ln -sf /sbin/xtables-nft-multi /sbin/iptables && \
  ln -sf /sbin/xtables-nft-multi /sbin/iptables-save && \
  ln -sf /sbin/xtables-nft-multi /sbin/iptables-restore
CMD ["entry"]

Then I used k3d image import to load the new image into the cluster (rough example commands at the end of this comment). Kubernetes eventually picks up the new image when it restarts the failed svclb-traefik-xxxxx pod.

It's a hack, but it gets ingress working on my system.
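
A rough sketch of those steps, assuming the k3d cluster is named mycluster (a placeholder) and that rebuilding under the original tag lets the existing DaemonSet pick up the image without further changes:

# Rebuild the image under the same tag using the Dockerfile above
docker build -t rancher/klipper-lb:v0.3.4 .

# Load the rebuilt image into the k3d cluster
k3d image import rancher/klipper-lb:v0.3.4 -c mycluster

# Optionally delete the stuck pods so the DaemonSet recreates them immediately
kubectl -n kube-system delete pod -l app=svclb-traefik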

r-ushil commented 1 year ago

Check this out for a quick fix:

https://github.com/k3d-io/k3d/issues/1021#issuecomment-1559194060

To solve the problem properly (rather than rely on this ad-hoc fix), I would suggest rewriting check_iptables_mode() to detect the backend by inspecting the symlinks under /sbin (e.g. with grep), rather than trying to use lsmod / modprobe.
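
One literal reading of that suggestion, sketched under the assumption that the entry script keeps a mode variable and separate nft/legacy setup helpers (the function body here is illustrative, not the actual klipper-lb code):

check_iptables_mode() {
    # Decide the backend from what the iptables symlink in /sbin points at,
    # instead of probing kernel modules with lsmod/modprobe.
    case "$(readlink -f /sbin/iptables 2>/dev/null)" in
        *xtables-nft-multi) mode=nft ;;
        *) mode=legacy ;;
    esac
}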

bartowl commented 1 year ago

It has now been over a year and this issue has still not been fixed? There are more and more nft-based systems, and this is really annoying... In particular, with 0.4.3:

+ info 'legacy mode detected'
+ echo '[INFO] ' 'legacy mode detected'
+ set_legacy
+ ln -sf /sbin/xtables-legacy-multi /sbin/iptables
[INFO]  legacy mode detected
+ ln -sf /sbin/xtables-legacy-multi /sbin/iptables-save
+ ln -sf /sbin/xtables-legacy-multi /sbin/iptables-restore
+ ln -sf /sbin/xtables-legacy-multi /sbin/ip6tables
+ start_proxy
+ echo 0.0.0.0/0
+ grep -Eq :
+ iptables -t filter -I FORWARD -s 0.0.0.0/0 -p TCP --dport 80 -j ACCEPT
modprobe: can't change directory to '/lib/modules': No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
iptables v1.8.8 (legacy): can't initialize iptables table `filter': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.

This is current k3d (5.5.1) using klipper-lb:v0.4.3 on Oracle Linux Server 8.7 (RHEL 8.7 binary compatible). The host is running iptables v1.8.4 (nf_tables) with the following packages installed:

iptables-1.8.4-23.0.1.el8.x86_64
nftables-0.9.3-26.el8.x86_64
iptables-ebtables-1.8.4-23.0.1.el8.x86_64
python3-nftables-0.9.3-26.el8.x86_64
iptables-libs-1.8.4-23.0.1.el8.x86_64

The proposed change to the detection would be to replace lsmod | grep "nf_tables" with lsmod | grep "nf_conntrack", since this is what the lsmod output looks like on this system after grepping for "nf_" (a sketch of the change follows the output):

#5 0.220 nf_conntrack_netlink    45056  0
#5 0.220 nf_reject_ipv4         16384  1 ipt_REJECT
#5 0.220 nf_nat                 45056  3 xt_nat,xt_MASQUERADE,nft_chain_nat
#5 0.220 nf_conntrack          147456  5 nf_conntrack_netlink,xt_nat,xt_conntrack,xt_MASQUERADE,nf_nat
#5 0.220 nf_defrag_ipv6         24576  1 nf_conntrack
#5 0.220 nf_defrag_ipv4         16384  1 nf_conntrack
#5 0.220 libcrc32c              16384  3 nf_nat,nf_conntrack,xfs
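
A rough sketch of that proposed change, with the surrounding function shape assumed rather than copied from the actual entry script, and assuming a successful match selects the nft backend just as the current nf_tables check does:

check_iptables_mode() {
    # bartowl's proposal: key the detection off nf_conntrack, which is loaded
    # on this nft-based host, instead of nf_tables, which lsmod does not list here.
    if lsmod | grep -q "nf_conntrack"; then
        mode=nft
    else
        mode=legacy
    fi
}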