k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

High load due to ksoftirqd, growing iptables rules #3117

Closed wursterje closed 1 year ago

wursterje commented 3 years ago

Environmental Info:
K3s Version: k3s version v1.20.4+k3s1 (838a906a), go version go1.15.8

Node(s) CPU architecture, OS, and Version: Linux 4.19.0-14-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux

Cluster Configuration: 1 master, 2 workers

Describe the bug: After some time we see high load on the machines due to high soft IRQs:

[Screenshot from 2021-03-25 09:11:41]

Output of perf report:

[Screenshot from 2021-03-25 09:14:33]

Something goes wrong with the iptables rules:

iptables -L produces 7.0 MB of rules (and the output keeps growing over time):

[Screenshot from 2021-03-25 09:18:01]

Steps To Reproduce:

brandond commented 3 years ago

Can you attach an actual listing of the iptables rules? It's hard to troubleshoot via a screenshot. Since it's 70+mb, compressing the file before attaching it may be useful.

Are you running anything else on this node that manages iptables rules? kube-proxy and flannel should be the only thing touching the rules; I suspect something is interfering with their ability to sync rules so they keep creating new ones.
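For reference, one way to capture the full ruleset and compress it for attaching (a sketch; iptables-save dumps every table, which is usually more useful for debugging than iptables -L output):

iptables-save > iptables.log
gzip iptables.log   # produces iptables.log.gz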

wursterje commented 3 years ago

@brandond

Can you attach an actual listing of the iptables rules? It's hard to troubleshoot via a screenshot. Since it's 70+mb, compressing the file before attaching it may be useful.

Here is the file: iptables.log.gz

Are you running anything else on this node that manages iptables rules? kube-proxy and flannel should be the only thing touching the rules; I suspect something is interfering with their ability to sync rules so they keep creating new ones.

fail2ban is installed.

brandond commented 3 years ago

Hmm, this appears to be 7 MB, not 70 MB, but still - there are a lot of duplicates in the KUBE-ROUTER-INPUT chain. This comes from the network policy controller, but I can't see anything on the code side that would cause this to occur.

Can you try disabling fail2ban (ensuring that it does not start again on startup) and restart the node? If the duplicate entries don't come back without fail2ban running then I am guessing that it is doing something to the ruleset that's causing duplicate rules to be created.

wursterje commented 3 years ago

@brandond Hmm, in my initial comment I mentioned 7.0 MB. Sorry for the misunderstanding...

I've disabled fail2ban, but the duplicate rules are still increasing over time. We have this issue on all machines running k3s v1.20.4+k3s1 but not on v1.19.8+k3s1. All machines are configured identically.

Here is a "iptables -L | wc -l" stat:

#1 Cluster 563 v1.20.4+k3s1 561 v1.20.4+k3s1 620 v1.20.4+k3s1 18 Pods

#2 Cluster 74 v1.19.8+k3s1 85 v1.19.8+k3s1 4526 v1.20.4+k3s1 59 Pods

#3 Cluster 1235 v1.20.4+k3s1 1235 v1.20.4+k3s1 1252 v1.20.4+k3s1 87 Pods

#4 Cluster 2617 v1.20.4+k3s1 67 v1.19.8+k3s1 2613 v1.20.4+k3s1 58 Pods

brandond commented 3 years ago

The code ensures that the cluster IP and node port rules are the first three in that chain; I'm not really sure how that could go awry unless something else is manipulating the rules. What Debian release are you running on these nodes? What does iptables --version show?

https://github.com/k3s-io/k3s/blob/355fff3017b06cde44dbd879408a3a6826fa7125/pkg/agent/netpol/network_policy_controller.go#L317-L338

wursterje commented 3 years ago

Debian Buster iptables v1.8.2 (nf_tables)

I've used the code snippet above for a little test program. The problem is that

iptablesCmdHandler.Exists("filter", chain, ruleSpec...)

always returns false. Maybe related to this issue: https://github.com/coreos/go-iptables/issues/79

The workaround for us is to periodically flush the iptables rules.

ensureRuleAtPosition_test.go.gz
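For context, a minimal sketch of the failing check using the github.com/coreos/go-iptables package (a simplified illustration rather than the attached test itself; the real logic lives in the network_policy_controller.go snippet linked above):

package main

import (
	"fmt"
	"log"

	"github.com/coreos/go-iptables/iptables"
)

func main() {
	ipt, err := iptables.New()
	if err != nil {
		log.Fatal(err)
	}

	// The same rule spec the network policy controller inserts at position 2 of
	// KUBE-ROUTER-INPUT (the chain is assumed to already exist with the
	// cluster-IP rule at position 1).
	ruleSpec := []string{
		"-p", "tcp",
		"-m", "addrtype", "--dst-type", "LOCAL",
		"-m", "comment", "--comment", "allow LOCAL TCP traffic to node ports",
		"-m", "multiport", "--dports", "30000:32767",
		"-j", "RETURN",
	}

	if err := ipt.Insert("filter", "KUBE-ROUTER-INPUT", 2, ruleSpec...); err != nil {
		log.Fatal(err)
	}

	// With Debian Buster's iptables v1.8.2 (nf_tables) this reports false even
	// though the rule was just inserted, because the match modules come back in
	// a different order than they were specified; the controller then re-inserts
	// the rule on every sync, producing the duplicates seen in this issue.
	exists, err := ipt.Exists("filter", "KUBE-ROUTER-INPUT", ruleSpec...)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("rule exists:", exists)
}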

clrxbl commented 3 years ago

I've been having the same issue, with tons of duplicate iptables rules being created. I've had servers with up to 40,000 iptables rules. Disabling the network policy controller fixes it (since I use Cilium as the CNI, it isn't necessary for me).

All of my nodes are running v1.20.4+k3s1

brandond commented 3 years ago

@clrxbl what OS distribution and iptables version?

clrxbl commented 3 years ago

@clrxbl what OS distribution and iptables version?

This node has 13549 iptables rules, the majority of them in the KUBE-ROUTER-INPUT chain.

iptables -V
iptables v1.8.2 (nf_tables)

uname -r
4.19.0-13-amd64

cat /etc/debian_version
10.7

All of my nodes run the same software versions.

clrxbl commented 3 years ago

I'd also like to say that I'm getting the exact same duplicate iptables rules created as well. It's all just the following rules repeated over and over again:

RETURN     udp  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL multiport dports 30000:32767 /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */
RETURN     tcp  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL multiport dports 30000:32767 /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */
brandond commented 3 years ago

Interesting - Debian's nftables seems to be the commonality then. I think that go-iptables issue is probably what we're running into.

Disabling the network policy controller should be an acceptable workaround, assuming you don't need policy enforcement.

brandond commented 3 years ago

I have been able to duplicate this on Debian Buster. There appears to be a bug in Debian's nftables package that prevents it from properly checking iptables rules; it seems to reorder the modules so that they cannot be checked for in the order originally input:

root@debian10:~# /usr/sbin/iptables -t filter -I KUBE-ROUTER-INPUT 2 -p tcp -m addrtype --dst-type LOCAL -m comment --comment "allow LOCAL TCP traffic to node ports" -m multiport --dports 30000:32767 -j RETURN
root@debian10:~# /usr/sbin/iptables -t filter -C KUBE-ROUTER-INPUT   -p tcp -m addrtype --dst-type LOCAL -m comment --comment "allow LOCAL TCP traffic to node ports" -m multiport --dports 30000:32767 -j RETURN
iptables: Bad rule (does a matching rule exist in that chain?).
root@debian10:~# /usr/sbin/iptables -t filter -C KUBE-ROUTER-INPUT   -p tcp -m addrtype --dst-type LOCAL -m multiport --dports 30000:32767 -m comment --comment "allow LOCAL TCP traffic to node ports" -j RETURN

This works properly after running update-alternatives --set iptables /usr/sbin/iptables-legacy:

root@debian10:~# /usr/sbin/iptables -t filter -I KUBE-ROUTER-INPUT 2 -p tcp -m addrtype --dst-type LOCAL -m comment --comment "allow LOCAL TCP traffic to node ports" -m multiport --dports 30000:32767 -j RETURN
root@debian10:~# /usr/sbin/iptables -t filter -C KUBE-ROUTER-INPUT   -p tcp -m addrtype --dst-type LOCAL -m comment --comment "allow LOCAL TCP traffic to node ports" -m multiport --dports 30000:32767 -j RETURN
root@debian10:~# /usr/sbin/iptables -t filter -C KUBE-ROUTER-INPUT   -p tcp -m addrtype --dst-type LOCAL -m multiport --dports 30000:32767 -m comment --comment "allow LOCAL TCP traffic to node ports" -j RETURN
iptables: Bad rule (does a matching rule exist in that chain?).

Since this appears to be a bug in the ~kernel~ iptables-nft code, I don't think either K3s or go-iptables can fix this - iptables on Debian should be put in legacy mode until this is resolved upstream.
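For anyone applying this on Debian Buster, a sketch of the switch (assuming the legacy alternatives are installed, as they are by default; a reboot, or a rule flush plus K3s restart, is needed afterwards so the rules get rebuilt through the legacy backend):

update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
reboot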

clrxbl commented 3 years ago

In that case I do think there should be some sort of warning placed during K3s installation when iptables is pointing to the Debian nftables backend until it's resolved.

brandond commented 3 years ago

Just validated that this works properly on Ubuntu 20.10:

root@seago:~# /usr/sbin/iptables -t filter -N KUBE-ROUTER-INPUT
root@seago:~# /usr/sbin/iptables -t filter -A KUBE-ROUTER-INPUT -p tcp -m addrtype --dst-type LOCAL -m comment --comment "allow LOCAL TCP traffic to node ports" -m multiport --dports 30000:32767 -j RETURN
root@seago:~# /usr/sbin/iptables -t filter -C KUBE-ROUTER-INPUT -p tcp -m addrtype --dst-type LOCAL -m comment --comment "allow LOCAL TCP traffic to node ports" -m multiport --dports 30000:32767 -j RETURN
RETURN  tcp opt -- in * out *  0.0.0.0/0  -> 0.0.0.0/0   ADDRTYPE match dst-type LOCAL /* allow LOCAL TCP traffic to node ports */ multiport dports 30000:32767
root@seago:~# /usr/sbin/iptables -t filter -C KUBE-ROUTER-INPUT -p tcp -m addrtype --dst-type LOCAL -m multiport --dports 30000:32767 -m comment --comment "allow LOCAL TCP traffic to node ports" -j RETURN
iptables: Bad rule (does a matching rule exist in that chain?).
root@seago:~# iptables -V
iptables v1.8.5 (nf_tables)
root@seago:~# uname -a
Linux seago.lan.khaus 5.11.8-051108-generic #202103200636 SMP Sat Mar 20 11:17:32 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
root@seago:~# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.10
DISTRIB_CODENAME=groovy
DISTRIB_DESCRIPTION="Ubuntu 20.10"
brandond commented 3 years ago

@clrxbl actually it looks like it's not even a kernel thing - it's just a bug in the version of the nftables package that Debian is shipping. If you apt remove iptables nftables -y and reboot the node, K3s will use its packaged version of the iptables/nftables tools, which work properly:

root@debian10:~# export PATH="/var/lib/rancher/k3s/data/current/bin/:/var/lib/rancher/k3s/data/current/bin/aux:$PATH"
root@debian10:~# which iptables
/var/lib/rancher/k3s/data/current/bin/aux/iptables
root@debian10:~# iptables -V
iptables v1.8.5 (nf_tables)
root@debian10:~# iptables -vnL KUBE-ROUTER-INPUT
# Warning: iptables-legacy tables present, use iptables-legacy to see them
Chain KUBE-ROUTER-INPUT (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all  --  *      *       0.0.0.0/0            10.43.0.0/16         /* allow traffic to cluster IP - M66LPN4N3KB5HTJR */
    0     0 RETURN     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* allow LOCAL TCP traffic to node ports - LR7XO7NXDBGQJD2M */ ADDRTYPE match dst-type LOCAL multiport dports 30000:32767
    0     0 RETURN     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* allow LOCAL UDP traffic to node ports - 76UCBPIZNGJNWNUZ */ ADDRTYPE match dst-type LOCAL multiport dports 30000:32767
root@debian10:~# uname -a
Linux debian10 4.19.0-16-amd64 #1 SMP Debian 4.19.181-1 (2021-03-19) x86_64 GNU/Linux
root@debian10:~#
wursterje commented 3 years ago

Putting iptables in legacy mode does not resolve the underlying issue with nftables for us.

Rules are apparently not duplicated...

iptables -L | wc -l
59

... but the output of nft tells us something different:

nft list table ip filter | wc -l
5858
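A possible cleanup for that leftover state, sketched under the assumption that K3s is restarted afterwards so kube-proxy/flannel/netpol re-create their rules via the legacy backend:

nft flush table ip filter    # drop the stale nf_tables-backed rules
systemctl restart k3s        # let K3s rebuild its rules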

dyipon commented 3 years ago

Had the same issue under Debian 10. Switched to legacy iptables, but it did not help. I had to reinstall the k3s cluster with Calico, and now it works fine.

brandond commented 3 years ago

You might try uninstalling the Debian iptables/nftables packages, rather than just switching to legacy mode.

dodwyer commented 3 years ago

@brandond thanks for investigating the issue. Do you have a link to more information on the nftables bug? Ideally we can push for this to be patched so that this triage is not needed.

brandond commented 3 years ago

I haven't gotten as far as tracking it down to the specific upstream commit that fixed it; I just know that iptables v1.8.2 (nf_tables) from Debian has the incorrect behavior, while the iptables v1.8.5 (nf_tables) that we ship (and that is currently available on Ubuntu >= 20.04) behaves correctly.

Thanzex commented 3 years ago

Hi! I'm having this exact problem with k3s 1.20.4 and Debian 10. Regular spikes of iptables processes and soft IRQs occasionally reach 100% CPU and constantly keep my CPU usage above 75%. I'm using some fail2ban rules, but as pointed out in this thread that doesn't seem to be the issue. What are some practical steps I can take to solve this? You pointed out some possible solutions, but I could not understand how to actually implement them.

clrxbl commented 3 years ago

Hi! I'm having this exact problem with k3s 1.20.4 and Debian 10. Regular spikes of iptables processes and soft IRQs occasionally reach 100% CPU and constantly keep my CPU usage above 75%. I'm using some fail2ban rules, but as pointed out in this thread that doesn't seem to be the issue. What are some practical steps I can take to solve this? You pointed out some possible solutions, but I could not understand how to actually implement them.

If you do not plan on using Kubernetes NetworkPolicy, or you are using (or planning to use) something other than Flannel and your network plugin handles network policies itself, then disable the network policy controller by adding --disable-network-policy to your K3s control plane servers (either modify your K3s config or the systemd service).
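A sketch of both options, assuming a default installation (the config file path /etc/rancher/k3s/config.yaml and the k3s systemd unit name are the defaults; the flag is only needed on server nodes):

# Option 1: config file
echo "disable-network-policy: true" >> /etc/rancher/k3s/config.yaml
systemctl restart k3s

# Option 2: systemd unit - append --disable-network-policy to the ExecStart line
# in /etc/systemd/system/k3s.service, then reload and restart:
systemctl daemon-reload
systemctl restart k3s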

brandond commented 3 years ago

uninstall the Debian iptables/nftables packages

Although obviously this won't work if you're using fail2ban and need the userspace tools. Ubuntu's iptables/nftables packages seem to be more up-to-date and don't suffer from this issue, so switching to Ubuntu is also a possible fix if that's an option for you.

dereknola commented 3 years ago

Update to this: I have validated that this is not an issue on CentOS 8, which ships with iptables v1.8.4.

[root@centosdev ~]# /usr/sbin/iptables -t filter -I KUBE-ROUTER-INPUT 2 -p tcp -m addrtype --dst-type LOCAL -m comment --comment "allow LOCAL TCP traffic to node ports" -m multiport --dports 30000:32767 -j RETURN
[root@centosdev ~]# /usr/sbin/iptables -t filter -C KUBE-ROUTER-INPUT   -p tcp -m addrtype --dst-type LOCAL -m comment --comment "allow LOCAL TCP traffic to node ports" -m multiport --dports 30000:32767 -j RETURN
[root@centosdev ~]# /usr/sbin/iptables -t filter -C KUBE-ROUTER-INPUT   -p tcp -m addrtype --dst-type LOCAL -m multiport --dports 30000:32767 -m comment --comment "allow LOCAL TCP traffic to node ports" -j RETURN
iptables: Bad rule (does a matching rule exist in that chain?).
[root@centosdev ~]# iptables -V
iptables v1.8.4 (nf_tables)
[root@centosdev ~]# uname -a
Linux centosdev 4.18.0-305.3.1.el8.x86_64 #1 SMP Tue Jun 1 16:14:33 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
MrPowerGamerBR commented 3 years ago

Can confirm that this issue still exists and that it causes a lot of problems in your k3s cluster (slow network requests, high CPU usage, etc.)!

Thanks brandond for the workaround :) - there should be a warning about this in the k3s documentation.

MichaelLhommeOFP commented 3 years ago

Can confirm this issue too on Debian buster.

After removing the Debian iptables package I'm happy to see the CPU load has dropped significantly (20% for k3s, no load for iptables), but I lost access to my ingresses. Any thoughts about this, @brandond?

The traefik load balancer seems OK to my untrained eye...

$> kubectl get -A services | grep LoadBalancer
kube-system                   traefik                                              LoadBalancer   10.43.249.149   PUBLIC_IP   80:32222/TCP,443:30708/TCP     46d

P.S.: I will happily try to debug this, but I don't know where to start.

MichaelLhommeOFP commented 3 years ago

Well, never mind: it turns out I had customized the traefik configuration in /var/lib/rancher/k3s/server/manifests/traefik-config.yaml to allow routing to containers started with Docker, and that routing broke when I uninstalled iptables.

I lost my self-managed containers, but the workaround is OK performance-wise:

[Screenshot: CPU load graph after the workaround]

brandond commented 3 years ago

@MichaelLhommeOFP you should be able to use the iptables in /var/lib/rancher/k3s/data/current/bin/aux/ to manually add whatever rules you need, although obviously the system-level iptables init scripts and such will not be available. Not having the system iptables available may also be breaking self-managed Docker containers. Time to move those standalone Docker containers into K3s?

unixfox commented 3 years ago

Has anyone tried whether this bug still exists on Debian 11?

MichaelLhommeOFP commented 3 years ago

I'm currently reinstalling using Bullseye and will keep you updated. iptables should be fixed:

$> sudo iptables -v
iptables v1.8.7 (nf_tables): no command specified
MichaelLhommeOFP commented 3 years ago

@MichaelLhommeOFP you should be able to use the iptables in /var/lib/rancher/k3s/data/current/bin/aux/ to manually add whatever rules you need, although obviously the system-level iptables init scripts and such will not be available. Not having the system iptables available may also be breaking self-managed Docker containers. Time to move those standalone Docker containers into K3s?

Yes indeed!

MichaelLhommeOFP commented 3 years ago

Has anyone tried whether this bug still exists on Debian 11?

Running smoothly for 2 days now and the CPU load remains constant - quite happy with the result!

unixfox commented 2 years ago

Are there other people who have tested on Debian 11 too?

paleozogt commented 2 years ago

I am seeing this on CentOS 8 Stream (which seems to contradict @dereknola).

I'm seeing huge CPU usage from iptables:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 869583 root      20   0 3983232   3.8g   1280 R 100.0   6.5  12:50.17 iptables
 869589 root      20   0 4012032   3.8g   1216 R 100.0   6.5  12:45.58 iptables-save
 872400 root      20   0 2400000   2.3g   1280 R 100.0   3.9   9:56.07 iptables
 872405 root      20   0 2405568   2.3g   1216 R 100.0   3.9  10:04.28 iptables-save
 872734 root      20   0 1953408   1.9g   1216 R 100.0   3.2   8:14.76 iptables-save
 873033 root      20   0 1444608   1.4g   1216 R 100.0   2.3   6:38.79 iptables-save
 873325 root      20   0 1009920 977.2m   1280 R 100.0   1.6   4:56.34 iptables
 873332 root      20   0  994752 988416   1216 R 100.0   1.6   4:56.75 iptables-save
 873621 root      20   0  727488 720384   1280 R 100.0   1.2   3:10.54 iptables
 867403 root      20   0 6476352   6.2g   1216 R  95.5  10.6  20:39.00 iptables-save
 868469 root      20   0 5619264   5.3g   1216 R  95.5   9.2  18:19.22 iptables-save
 868478 root      20   0 5661504   5.4g   1280 R  95.5   9.2  18:11.14 iptables
 868892 root      20   0 5041536   4.8g   1216 R  95.5   8.2  16:25.83 iptables-save
 869210 root      20   0 4511616   4.3g   1280 R  95.5   7.4  14:39.87 iptables
 869231 root      20   0 4560576   4.3g   1216 R  95.5   7.4  14:58.03 iptables-save
 873625 root      20   0  725376 717696   1216 R  95.5   1.2   3:15.53 iptables-save

Running iptables -L shows thousands of duplicated lines like this:

ACCEPT     all  --  anywhere             anywhere             /* rule to explicitly ACCEPT traffic that comply to network policies */ mark match 0x20000/0x20000
ACCEPT     all  --  anywhere             anywhere             /* rule to explicitly ACCEPT traffic that comply to network policies */ mark match 0x20000/0x20000
ACCEPT     all  --  anywhere             anywhere             /* rule to explicitly ACCEPT traffic that comply to network policies */ mark match 0x20000/0x20000
ACCEPT     all  --  anywhere             anywhere             /* rule to explicitly ACCEPT traffic that comply to network policies */ mark match 0x20000/0x20000

It's so many that the iptables output is nearly 50MB:

~ sudo iptables -L > iptables.txt
~ ls -lh iptables.txt
-rw-rw-r-- 1 admin admin 49M Nov  1 08:02 iptables.txt

The iptables version:

~ iptables -V
iptables v1.8.4 (nf_tables)
ChristianCiach commented 2 years ago

We are seeing the exact same issue with CentOS 8 (non "Stream"). Everything looks exactly the same as described by @paleozogt. Even the output of iptables -V is the same.

Uninstalling nftables on the host is not really an option for us :(

brandond commented 2 years ago

As noted above in https://github.com/k3s-io/k3s/issues/3117#issuecomment-810617920, this is a bug in the iptables-nft package provided by your distro.

ChristianCiach commented 2 years ago

I know. I just wanted to confirm that this is still an issue, even on CentOS 8.

paleozogt commented 2 years ago

I'm not sure what the fix is since we can't uninstall iptables-nft. Build a new version?

ChristianCiach commented 2 years ago

@paleozogt I am not really recommending the following, but...

K3s appends the directory {{ k3s_data_dir }}/data/current/bin/aux to the PATH at runtime (see the source code file /cmd/k3s/main.go). In theory, we should be able to work around this issue by prepending the directory {{ k3s_data_dir }}/data/current/bin/aux to the PATH, for example by manipulating the systemd service file.

This is probably bad advice for multiple reasons, but I wanted to write down this idea anyway.
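As a sketch of that idea, a systemd drop-in (the drop-in file name is arbitrary, and the data directory assumes a default install under /var/lib/rancher/k3s) would look roughly like this, followed by systemctl daemon-reload and systemctl restart k3s:

# /etc/systemd/system/k3s.service.d/override.conf
[Service]
Environment="PATH=/var/lib/rancher/k3s/data/current/bin/aux:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"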

vrince commented 2 years ago

brandond's workaround (uninstalling iptables with sudo apt remove iptables nftables -y) worked perfectly (Ubuntu 20.04 / k3s 1.21). Thanks for that! Maybe some check could be added to the k3s.sh install script to check the installed version of iptables? This issue produces a decent amount of noise and gives a bad impression of k3s straight after install.

ffly90 commented 2 years ago

We are also seeing this problem on Rocky 8.5:

$ cat /etc/os-release
NAME="Rocky Linux"
VERSION="8.5 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.5"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.5 (Green Obsidian)"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
ROCKY_SUPPORT_PRODUCT="Rocky Linux"
ROCKY_SUPPORT_PRODUCT_VERSION="8"
$ rpm -qa|grep table
python3-nftables-0.9.3-21.el8.x86_64
iptables-1.8.4-20.el8.x86_64
nftables-0.9.3-21.el8.x86_64
iptables-libs-1.8.4-20.el8.x86_64
iptables-ebtables-1.8.4-20.el8.x86_64 
$ iptables -L | wc -l
# Warning: iptables-legacy tables present, use iptables-legacy to see them
192187
chris93111 commented 2 years ago

Hi, has anyone had success using the embedded iptables rather than the OS version?

It is not possible to remove the standard iptables 1.8.4 on RHEL/CentOS.

Xaero12 commented 2 years ago

@firefly-serenity, for Rocky Linux: if you're running k3s and do not need network policies, you can append --disable-network-policy to your service file, flush the rules that are already there with iptables --flush, and then start k3s; the iptables rules shouldn't keep duplicating any further. If you are using network policies, it doesn't appear that Rocky has updated its iptables package (at least I couldn't find an update at the time of writing), but putting that disable flag in the k3s.service file did the trick for me.
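Roughly, the procedure described above as a sketch (iptables --flush only clears the filter table, so a reboot is an alternative if other tables are bloated as well):

systemctl stop k3s
iptables --flush
# add --disable-network-policy to the k3s server arguments in /etc/systemd/system/k3s.service
systemctl daemon-reload
systemctl start k3s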

ffly90 commented 2 years ago

@Xaero12 thank you for the hint. I'm still looking for a solution that does not require disabling network policies.

@brandond

as noted above in #3117 (comment), this is a bug in the iptables-nft package provided by your distro.

Unfortunately this is not the case. As shown here: #3117 (comment2), your reproduction does not apply to v1.8.4 on CentOS. It might be a totally different bug.

ffly90 commented 2 years ago

We had another incident tonight. When going through the OS logs, we found this interesting section:

Mar 27 15:15:34 redacted k3s[170294]: E0327 15:15:34.277055  170294 network_policy_controller.go:269] Aborting sync. Failed to run iptables-restore: exit status 1 (iptables-restore: line 607275 failed
[...]
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -s 10.42.0.0/16 -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -d 10.42.0.0/16 -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -s 10.42.0.0/16 -m comment --comment "flanneld forward" -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -d 10.42.0.0/16 -m comment --comment "flanneld forward" -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m comment --comment "kube-router netpol - REDACTED" -j KUBE-ROUTER-FORWARD
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -s 10.42.0.0/16 -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -d 10.42.0.0/16 -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m comment --comment "kube-router netpol - REDACTED" -j KUBE-ROUTER-FORWARD
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -s 10.42.0.0/16 -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -d 10.42.0.0/16 -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -s 10.42.0.0/16 -m comment --comment "flanneld forward" -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -d 10.42.0.0/16 -m comment --comment "flanneld forward" -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m comment --comment "kube-router netpol - REDACTED" -j KUBE-ROUTER-FORWARD
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -s 10.42.0.0/16 -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -d 10.42.0.0/16 -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m comment --comment "kube-router netpol - REDACTED" -j KUBE-ROUTER-FORWARD
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -s 10.42.0.0/16 -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -d 10.42.0.0/16 -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -s 10.42.0.0/16 -m comment --comment "flanneld forward" -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -d 10.42.0.0/16 -m comment --comment "flanneld forward" -j ACCEPT
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m comment --comment "kube-router netpol - REDACTED" -j KUBE-ROUTER-FORWARD
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
Mar 27 15:15:34 redacted k3s[170294]: -A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
[...]

This happened lots of times and the section I put in here repeats hundreds of times.

thibault-ketterer commented 2 years ago

We had another incident tonight. When going through the OS logs, we found this interesting section: [...] This happened lots of times and the section I put in here repeats hundreds of times.

I've got this problem on Rocky Linux and don't want to disable network policies.

Have you found any solution so far? I'm available to help.

ffly90 commented 2 years ago

We had another incident tonight. When going through the OS logs, we found this interesting section: [...] This happened lots of times and the section I put in here repeats hundreds of times.

I've got this problem on Rocky Linux and don't want to disable network policies.

Have you found any solution so far? I'm available to help.

@thibault-ketterer we are still at the same point. We implemented a monitoring check that notifies us if the rules are bloating again. In those cases we reboot the affected systems. It is not ideal but the only way to provide a stable system at the moment.
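As a sketch, such a check can be as simple as counting rules and alerting above a threshold (the threshold and the Nagios-style exit code are assumptions; tune them for your cluster):

#!/bin/sh
THRESHOLD=5000                      # assumed limit; healthy nodes here sit in the hundreds
COUNT=$(iptables-save | wc -l)
if [ "$COUNT" -gt "$THRESHOLD" ]; then
    echo "CRITICAL: $COUNT iptables rules (threshold $THRESHOLD)"
    exit 2
fi
echo "OK: $COUNT iptables rules"
exit 0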

brandond commented 2 years ago

I'm not sure why distros are still shipping broken versions of iptables, but this is a bug in the OS-packaged versions of iptables. If you want to use the working versions included with K3s, you can uninstall the OS packages and restart K3s.

ChristianCiach commented 2 years ago

you can uninstall the OS packages and restart K3s

If only it were that simple. Our sysadmins would slap me if I uninstalled this package.

Seeing that this issue only appears when using kube-router's network policy controller, and nobody has yet been able to even find an upstream ticket about it (at iptables/nftables), wouldn't it be a better idea to add a workaround to kube-router? I am sure this is easier said than done, but this is really annoying, and nobody even knows if or when the distros will finally ship a fixed iptables.

Edit: Maybe it would be easier to work around this issue if k3s had some kind of option to prefer its bundled tools over the host tools, instead of using them only as a fallback.

brandond commented 2 years ago

@manuelbuil since this seems to only be affecting the kube-router-managed rules, is it possible there's an issue on that side?