bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

v1.26.2 - kube-proxy - Failed to execute iptables-restore - unknown option "--xor-mark" #4295

Open MrFishFinger opened 22 hours ago

MrFishFinger commented 22 hours ago

Image I'm using: v1.26.2 (Linux kernel 5.15.168)

What I expected to happen: kube-proxy to operate without errors

What actually happened: kube-proxy repeatedly logs the following error:

I1113 16:15:07.908800 1 proxier.go:853] "Syncing iptables rules"
I1113 16:15:07.928773 1 proxier.go:1464] "Reloading service iptables data" numServices=0 numEndpoints=0 numFilterChains=4 numFilterRules=3 numNATChains=4 numNATRules=5
E1113 16:15:07.931291 1 proxier.go:1481] "Failed to execute iptables-restore" err=<
exit status 2: ip6tables-restore v1.8.4 (legacy): unknown option "--xor-mark"
Error occurred at line: 16
Try `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.
>
I1113 16:15:07.931308 1 proxier.go:858] "Sync failed" retryingTime="30s"
I1113 16:15:07.931317 1 proxier.go:820] "SyncProxyRules complete" elapsed="22.67239ms"
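
For context, the "--xor-mark" option comes from the rule kube-proxy writes to clear its masquerade mark in the KUBE-POSTROUTING chain, so line 16 of the restore input is most likely that rule. A sketch of the relevant nat-table fragment, based on upstream kube-proxy defaults (the 0x4000 mark value and exact rule text are assumptions, not captured from this node):

*nat
:KUBE-POSTROUTING - [0:0]
-A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
-A KUBE-POSTROUTING -j MARK --xor-mark 0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE
COMMIT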

How to reproduce the problem:

  1. add a "v1.26.2" Bottlerocket node to an IPv4 EKS cluster running Kubernetes 1.24
  2. check the "kube-proxy" logs (an example command follows these steps)
  3. observe the error
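
For step 2, something like the following pulls the relevant log lines (this assumes the default EKS kube-proxy DaemonSet label k8s-app=kube-proxy; adjust if your labels differ):

kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=100 | grep -i "xor-mark"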

NOTE: rolling back to the "v1.26.1" image (Linux kernel 5.15.167) fixes the issue.
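
To confirm which image and kernel each node is running before and after a rollback, the standard node status fields already expose both, for example:

kubectl get nodes -o custom-columns=NAME:.metadata.name,OS-IMAGE:.status.nodeInfo.osImage,KERNEL:.status.nodeInfo.kernelVersion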


Details from the affected 1.26.2 node:

[ssm-user@control]$ cat /etc/*rel*
NAME=Bottlerocket
ID=bottlerocket
VERSION="1.26.2 (aws-k8s-1.24)"
...
bash-5.1# uname -a
Linux ip-x-x-x-x.eu-west-1.compute.internal 5.15.168 #1 SMP Fri Nov 1 22:54:32 UTC 2024 x86_64 GNU/Linux

Details from a healthy 1.26.1 node:

[ssm-user@control]$ cat /etc/*rel*
NAME=Bottlerocket
ID=bottlerocket
VERSION="1.26.1 (aws-k8s-1.24)"
...
bash-5.1# uname -a
Linux ip-x-x-x-x.eu-west-1.compute.internal 5.15.167 #1 SMP Thu Oct 24 18:28:21 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
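
In case it helps narrow this down, comparing the IPv6 netfilter/MARK support between the two kernels from the admin container might be useful. A rough sketch (the module and config names here are only my guess at where the regression could be, and /proc/config.gz is only present if the kernel enables CONFIG_IKCONFIG_PROC):

bash-5.1# lsmod | grep -iE 'ip6table|xt_mark'
bash-5.1# zgrep -iE 'CONFIG_IP6_NF|CONFIG_NETFILTER_XT.*MARK' /proc/config.gz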
Sparksssj commented 21 hours ago

Thanks for the detailed report. I'm working on reproducing the issue now and will update once I have more information.