Closed: dzacball closed this issue 4 months ago.
Just to make sure it's not specific to a combination of HEPs/GNPs/NetworkSets, I created a cluster with v3.27.3
and deleted all related resources. After that, I restarted calico-node, and the issue is still present:
$ k get networksets.crd.projectcalico.org -A
No resources found
$ k get hostendpoints.crd.projectcalico.org
No resources found
$ k get globalnetworkpolicies.crd.projectcalico.org
No resources found
$ k get networkpolicies.crd.projectcalico.org -A
No resources found
$ k get networkpolicies.networking.k8s.io -A
No resources found
$ k rollout restart daemonset -n kube-system calico-node
daemonset.apps/calico-node restarted
$ k get po -n kube-system | grep calico
calico-kube-controllers-5b784b4785-7898t 1/1 Running 0 91m
calico-node-gk7ft 1/1 Running 0 78s
calico-node-ngz97 1/1 Running 0 2m59s
calico-typha-5667cc7f45-d7lds 0/1 Pending 0 91m
calico-typha-5667cc7f45-kp9c4 1/1 Running 0 91m
calico-typha-5667cc7f45-l5pxg 1/1 Running 0 91m
$ k top po -n kube-system | grep calico
calico-kube-controllers-5b784b4785-7898t 3m 16Mi
calico-node-gk7ft 285m 125Mi <-- CPU should be much lower, about 1/10th of this
calico-node-ngz97 247m 132Mi <-- CPU should be much lower, about 1/10th of this
calico-typha-5667cc7f45-kp9c4 3m 20Mi
calico-typha-5667cc7f45-l5pxg 2m 21Mi
$ k logs -n kube-system calico-node-ngz97 --since 1s
Defaulted container "calico-node" out of: calico-node, install-cni (init)
2024-05-24 08:24:24.200 [INFO][59] felix/xdp_state.go 200: Retrying after an XDP update failure...
2024-05-24 08:24:24.244 [INFO][59] felix/xdp_state.go 200: Retrying after an XDP update failure...
2024-05-24 08:24:24.289 [INFO][59] felix/xdp_state.go 200: Retrying after an XDP update failure...
2024-05-24 08:24:24.332 [INFO][59] felix/xdp_state.go 200: Retrying after an XDP update failure...
2024-05-24 08:24:24.372 [INFO][59] felix/xdp_state.go 200: Retrying after an XDP update failure...
2024-05-24 08:24:24.428 [INFO][59] felix/int_dataplane.go 2154: Applying XDP actions did not succeed, disabling XDP error=failed to resync: failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error loading BTF: Invalid argument(22)
libbpf: magic: 0xeb9f
version: 1
flags: 0x0
...
...
...
Worker env:
# cat /etc/os-release | grep VERSION
VERSION="20.04.6 LTS (Focal Fossa)"
VERSION_ID="20.04"
VERSION_CODENAME=focal
# uname -r
5.4.0-182-generic
Could you check your syslog for any networkd-dispatcher, NetworkManager, ethtool or systemd-udevd flooding? I just experienced something similar, where my master nodes didn't show any particular Kubernetes process loading up the node (at a glance in k9s), but the node itself was at 10x the usual load (top), after upgrading from 3.25 to 3.28. I don't think my problem is BPF related.
This, apparently, only affected my on-premise arch.
focal
systemd 245 (245.4-4ubuntu3.23)
kernel: 5.4.0-152-generic
k8s: v1.25.12
May 24 13:23:08 master networkd-dispatcher[688]: ERROR:Unknown interface index 1419614 seen even after reload
May 24 13:23:08 master networkd-dispatcher[688]: WARNING:Unknown index 1419616 seen, reloading interface list
May 24 13:23:08 master systemd-udevd[701481]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
[...]
May 24 13:23:08 master systemd-udevd[701481]: ethtool: could not get ethtool features for calico_tmp_B
May 24 13:23:08 master systemd-udevd[701481]: Could not set offload features of calico_tmp_B: No such device
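A quick way to gauge that kind of flooding (a sketch; assumes logs land in /var/log/syslog on Ubuntu):
# grep -cE 'networkd-dispatcher|NetworkManager|ethtool|systemd-udevd' /var/log/syslog
An unusually high and steadily growing count while calico-node churns would point in this direction.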
@spacegaucho similar stuff here:
# cat /var/log/syslog | tail -n 10
May 24 14:39:03 pres-cp83edv10o3dr35lr8g0-calidebug-default-00000228 networkd-dispatcher[785]: WARNING:Unknown index 450967 seen, reloading interface list
May 24 14:39:03 pres-cp83edv10o3dr35lr8g0-calidebug-default-00000228 systemd-udevd[10804]: calico_tmp_B: Failed to get link config: No buffer space available
May 24 14:39:03 pres-cp83edv10o3dr35lr8g0-calidebug-default-00000228 systemd-udevd[10802]: calico_tmp_A: Failed to get link config: No buffer space available
May 24 14:39:03 pres-cp83edv10o3dr35lr8g0-calidebug-default-00000228 networkd-dispatcher[785]: WARNING:Unknown index 450969 seen, reloading interface list
May 24 14:39:03 pres-cp83edv10o3dr35lr8g0-calidebug-default-00000228 systemd-udevd[10804]: calico_tmp_B: Failed to get link config: No buffer space available
May 24 14:39:03 pres-cp83edv10o3dr35lr8g0-calidebug-default-00000228 systemd-udevd[10802]: calico_tmp_A: Failed to get link config: No buffer space available
May 24 14:39:03 pres-cp83edv10o3dr35lr8g0-calidebug-default-00000228 systemd-networkd[679]: calico_tmp_B: Failed to wait for the interface to be initialized: No such device
May 24 14:39:03 pres-cp83edv10o3dr35lr8g0-calidebug-default-00000228 networkd-dispatcher[785]: WARNING:Unknown index 450971 seen, reloading interface list
May 24 14:39:03 pres-cp83edv10o3dr35lr8g0-calidebug-default-00000228 systemd-udevd[10802]: calico_tmp_A: Failed to get link config: No buffer space available
May 24 14:39:03 pres-cp83edv10o3dr35lr8g0-calidebug-default-00000228 systemd-udevd[10804]: calico_tmp_B: Failed to get link config: No buffer space available
Could you check your calico-node logs to see if there's anything like the logs below?
[INFO][60] felix/int_dataplane.go 2154: Applying XDP actions did not succeed, disabling XDP error=failed to resync: failed to load BPF program (/usr/lib/calico/bpf/filter.o): stat /sys/fs/bpf/calico/xdp/prefilter_v1_calico_tmp_A: no such file or directory
libbpf: Error loading BTF: Invalid argument(22)
@dzacball looks like we are onto something here.
No, there's nothing referencing the events you mentioned:
k stern -n kube-system calico-node-* --since=48h --no-follow | grep 'XDP actions did not succeed'
+ calico-node-zlbwb › calico-node
+ calico-node-gm87s › calico-node
+ calico-node-4fp5p › calico-node
+ calico-node-vkh8q › calico-node
+ calico-node-pphxf › calico-node
+ calico-node-cdp9k › calico-node
+ calico-node-8smsj › calico-node
+ calico-node-nfww9 › calico-node
+ calico-node-lptlx › calico-node
- calico-node-cdp9k › calico-node
- calico-node-4fp5p › calico-node
- calico-node-vkh8q › calico-node
- calico-node-nfww9 › calico-node
- calico-node-zlbwb › calico-node
- calico-node-8smsj › calico-node
- calico-node-lptlx › calico-node
- calico-node-gm87s › calico-node
- calico-node-pphxf › calico-node
But the issue persists. I had to roll back to 3.25 and that "fixed" the issue. I'm browsing around for any additional information.
@spacegaucho can you possibly check if the issue is gone after upgrading bpftool?
Sure! @mazdakn how should I go about that? My calico-node pods currently run the following:
bpftool --version
bpftool v5.3.0
Should I test a v3.29.0-0.dev image? Or would I need to update the binary manually inside the pods? Thanks.
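For reference, the version can be checked across all the calico-node pods without a new image (a sketch, assuming the standard k8s-app=calico-node label on the pods):
$ for p in $(kubectl get po -n kube-system -l k8s-app=calico-node -o name); do echo "$p: $(kubectl exec -n kube-system "$p" -c calico-node -- bpftool version)"; done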
@spacegaucho Thanks for being OK to test it. I'll provide you an image with the updated bpftool.
I got a similar issue here. I'm not very sure about the root cause, but once I upgraded my kernel from 5.4.0-182-generic to 5.15.0-107-generic, the issue disappeared.
Referring to the kernel versions here: https://ubuntu.com/security/livepatch/docs/livepatch/reference/kernels. What I did was install the HWE kernel (I'm using Ubuntu 20.04 LTS).
So I'm not sure if this issue is related to a specific kernel version or actually related to the HWE kernel.
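For reference, installing the HWE kernel on Ubuntu 20.04 is roughly this (a sketch; a reboot is needed for the new kernel to take effect):
# apt-get update
# apt-get install --install-recommends linux-generic-hwe-20.04
# reboot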
@spacegaucho here is an image with the updated bpftool (v7.4). Please give it a try, and let me know if it works for you.
mazdakrn/calico-node:latest
This is based on master branch, but should be OK to test it in a v3.28 cluster.
@dzacball @spacegaucho can you also disable XDP mode in iptables by setting xdpEnabled to false in the felixconfiguration resource? This should prevent felix from trying to load the iptables XDP objects.
Ref: https://docs.tigera.io/calico/latest/reference/resources/felixconfig
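For example, with the Calico API server (or calicoctl) available, that could look roughly like this (a sketch, assuming the cluster-wide FelixConfiguration is named default):
$ kubectl patch felixconfiguration default --type merge -p '{"spec":{"xdpEnabled":false}}'
Without the API server, the same merge patch should be applicable to the felixconfigurations.crd.projectcalico.org resource instead.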
@mazdakn I can verify that if I set xdpEnabled to false, the issue is gone.
@dzacball can you also try the image I mentioned above in one of your test clusters?
@mazdakn I tested your image - it works, issue is gone. (As I already mentioned, I also did some tests a few weeks back with a self-built image, using v3.28.0 + latest bpftool, worked as well)
sorry for not replying earlier, will try it asap.
Can confirm the updated image fixed the issue for me as well:
1. Rolled back to 3.25 and checked that there were no matches for the "XDP actions did not succeed" string, as I tested in a previous post.
2. Upgraded to 3.28 (using this upgrade path) and verified that the string was present and flooding.
3. Modified my upgrade.yaml file to include the image provided by @dzacball, applied it, and verified that after the image was set in place and the pods restarted, the log flooding stopped (a rough equivalent is sketched below).
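The same image swap can also be done directly on the DaemonSet in a manifest-based install (a sketch; not applicable when the tigera-operator manages the resources, since it would reconcile the change away):
$ kubectl set image daemonset/calico-node -n kube-system calico-node=mazdakrn/calico-node:latest
$ kubectl rollout status daemonset/calico-node -n kube-system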
@spacegaucho thanks for testing and confirmation. Did you also manage to set xdpEnabled to false and see if the issue is gone?
Re xdpEnabled: no, sorry, should I test that directly in vanilla 3.28?
Yes, with vanilla 3.28 image.
I notice that #8880 is flagged for 3.29, is it possible this will also get a 3.28 backport? I was about to go down the rabbit hole of switching to the eBPF datapath and exploring XDP, but don't want to run into this issue.
@isugimpy we will definitely backport it to v3.28, and also v3.27. However, this issue does not affect the eBPF dataplane; it is about XDP as used in the iptables dataplane. The eBPF dataplane is NOT affected.
Oh, brilliant. I didn't catch that this was unique to iptables! Thank you!
@mazdakn Do you have an ETA for upcoming 3.27 and 3.28 releases that will already contain this fix PR?
@dzacball v3.27.4 is expected to be released early July (probably the first week). There is no ETA for 3.28 patch release yet.
@mazdakn Thank you.
Closing since the fixes (including back ports to v3.27 and v3.28) are merged now, and will be available in the next patch releases.
@dzacball 3.28.1 is planned to be released in the second half of July.
thanks for the heads-up
@mazdakn Do you have any update when the release will arrive? Thank you!
@mihivagyok we are in the process of releasing 3.28.1 atm. It most likely will be released early next week.
Expected Behavior
Calico using a reasonable amount of CPU. No XDP/BPF related error logs in calico-node.

Current Behavior
Calico is using about 10x its usual CPU. In the calico-node logs, I can see messages like the "Applying XDP actions did not succeed" and libbpf "Error loading BTF" errors quoted earlier.

Possible Solution
If I rebuild calico-node using an up-to-date version of bpftool (tried with v7.4 and it worked), the issue is gone.

Steps to Reproduce (for bugs)
Not sure how to reproduce. It happens in our env, and it was introduced by updates to map definitions in https://github.com/projectcalico/calico/pull/8610.

Context
We cannot use the affected versions of Calico due to the high CPU load (and of course we aren't sure what other issues could arise).

Your Environment
Calico v3.27.3 and/or v3.28.0. K8s 1.28 + Ubuntu 20.04.6 workers with the 5.4.0-177-generic kernel.

More details
I could confirm that this issue was introduced by this PR: https://github.com/projectcalico/calico/pull/8610. It is present in Calico v3.27.3 and v3.28.0. If I revert the PR and build my own calico-node image on top of v3.28.0, the issue is gone. Also, if I rebuild calico-node with an up-to-date version of bpftool (tried with v7.4 and it worked), the issue is gone.
Related slack thread: https://calicousers.slack.com/archives/CPTH1KS00/p1713552425270619