Closed disconn3ct closed 2 days ago
What else should I do to get this merged?
Sorry for the long time it took to review this. I added some comments above. Most of it seems fine. I guess you explicitly disabled some features that are implicitly enabled by their parent configs? IMO this is not needed. If we enable a parent feature, it is fine to let the kernel enable additional sub-features according to its defaults. Tailoring it too much might be confusing: users would see a kernel module or feature that is not fully functional in the way they are used to on plain Debian or other distros.
Basically everything came from getting sudo k3s check-config to green.
It was a while ago and I don't remember what magic I used to get the config delta. I think it was menuconfig and the built-in config diff. (Looks like it was 6.10.11.)
Let me undo the questionable parts and try with that kernel. If it really does not work, we can check again. But I cannot imagine that it requires features which are not even available in the official Debian kernel, or that it requires some otherwise enabled features to be explicitly disabled. I am not sure how this config check tool works, but before we do weird stuff, we should at least fully understand why it thinks it requires this.
The commit history is transparent here, so we can recover any of this if needed. But let's go with this for now, basically matching the well-known and widely used Debian and RPi kernel builds. I'll create an image from this ASAP.
@disconn3ct can you test with this kernel build: https://dietpi.com/downloads/binaries/testing/ Or with the respective image here: https://dietpi.com/downloads/images/testing/
Crashloop.
calico-node-m6xg4 calico-node 2024-11-22 13:02:59.842 [WARNING][76] felix/int_dataplane.go 2162: Failed to synchronize routing table, will retry...
calico-node-m6xg4 calico-node 2024-11-22 13:02:59.943 [INFO][76] felix/wireguard.go 1704: Trying to connect to linkClient ipVersion=0x4
calico-node-m6xg4 calico-node 2024-11-22 13:02:59.944 [INFO][76] felix/route_rule.go 189: Trying to connect to netlink
calico-node-m6xg4 calico-node 2024-11-22 13:02:59.946 [ERROR][76] felix/route_rule.go 248: Failed to list routing rules, retrying... error=operation not supported ipVersion=4
calico-node-m6xg4 calico-node 2024-11-22 13:02:59.947 [WARNING][76] felix/int_dataplane.go 2162: Failed to synchronize routing table, will retry...
calico-node-m6xg4 calico-node 2024-11-22 13:03:00.051 [INFO][76] felix/wireguard.go 1704: Trying to connect to linkClient ipVersion=0x4
calico-node-m6xg4 calico-node 2024-11-22 13:03:00.052 [INFO][76] felix/route_rule.go 189: Trying to connect to netlink
calico-node-m6xg4 calico-node 2024-11-22 13:03:00.054 [ERROR][76] felix/route_rule.go 248: Failed to list routing rules, retrying... error=operation not supported ipVersion=4
And k3s check-config (trimmed):
Optional Features:
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: missing
- CONFIG_RT_GROUP_SCHED: missing
- Network Drivers:
- "overlay":
- CONFIG_VXLAN: enabled (as module)
Optional (for encrypted networks):
- CONFIG_CRYPTO: enabled
- CONFIG_CRYPTO_AEAD: enabled
- CONFIG_CRYPTO_GCM: enabled (as module)
- CONFIG_CRYPTO_SEQIV: enabled (as module)
- CONFIG_CRYPTO_GHASH: enabled (as module)
- CONFIG_XFRM: enabled
- CONFIG_XFRM_USER: enabled (as module)
- CONFIG_XFRM_ALGO: enabled (as module)
- CONFIG_INET_ESP: enabled (as module)
- CONFIG_INET_XFRM_MODE_TRANSPORT: missing
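For longer reports it can help to filter the listing down to the entries marked as missing. The following is only a minimal sketch: it assumes the plain-text line format quoted in this thread ("- CONFIG_X: status"), and the function names and sample text are made up for illustration, not part of k3s.

```python
import re

# Matches lines such as "- CONFIG_VXLAN: enabled (as module)"
# or "- CONFIG_RT_GROUP_SCHED: missing" (format assumed from this thread).
LINE_RE = re.compile(r"^\s*-\s*(CONFIG_[A-Z0-9_]+):\s*(.+?)\s*$")


def parse_check_config(text):
    """Return {option: status} for every CONFIG_ line in the report."""
    statuses = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            statuses[match.group(1)] = match.group(2)
    return statuses


def missing_options(text):
    """Return the sorted names of all options reported as missing."""
    statuses = parse_check_config(text)
    return sorted(name for name, status in statuses.items() if status == "missing")


# Hypothetical sample, copied from a few lines of the trimmed output.
SAMPLE = """\
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: missing
- CONFIG_RT_GROUP_SCHED: missing
- CONFIG_VXLAN: enabled (as module)
- CONFIG_INET_XFRM_MODE_TRANSPORT: missing
"""

if __name__ == "__main__":
    for name in missing_options(SAMPLE):
        print(name)
```

Note that this only reports what the check tool prints. It cannot tell whether a missing option is actually relevant to a given failure.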
But the features listed as "missing" are all also listed as "optional", and do not seem at all related to listing routing rules. Can you point me to the source code of this felix/route_rule.go file, so I can check what exactly it tries to do?
Found it: https://github.com/projectcalico/calico/blob/master/felix/routerule/route_rule.go#L244 So it is about listing routing rules via netlink, which does not seem to be related to these 3 options. I'll check.
EDIT:
root@SOQuartz:~# ip rule
RTNETLINK answers: Operation not supported
Dump terminated
Yeah, that is an issue.
I think I found them:
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
These are needed for any further routing capabilities, e.g. also when providing a hotspot or AP. They are somewhat essential and also enabled in the Debian and RPi kernels. I'll rebuild the kernel with these.
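A quick way to verify the two options above on a given build is to scan the kernel config text for them. This is a rough sketch under assumptions: the config text is obtained elsewhere (commonly from /boot/config-<version>, or /proc/config.gz if the kernel exposes it), and the helper names are hypothetical.

```python
# The two options named in this thread; both were set to "y" there.
REQUIRED = ("CONFIG_IP_ADVANCED_ROUTER", "CONFIG_IP_MULTIPLE_TABLES")

NOT_SET_SUFFIX = " is not set"


def parse_kernel_config(text):
    """Map option names to values; '# CONFIG_X is not set' maps to 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            # Strip the leading "# " and the trailing " is not set".
            options[line[2:-len(NOT_SET_SUFFIX)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def not_built_in(text, required=REQUIRED):
    """Return the required options that are absent or not set to 'y'."""
    options = parse_kernel_config(text)
    return [name for name in required if options.get(name, "n") != "y"]


# Hypothetical config fragments for illustration only.
BEFORE = """\
CONFIG_INET=y
# CONFIG_IP_ADVANCED_ROUTER is not set
CONFIG_VXLAN=m
"""

AFTER = """\
CONFIG_INET=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_VXLAN=m
"""

if __name__ == "__main__":
    print(not_built_in(BEFORE))
    print(not_built_in(AFTER))
```

An option that does not appear in the config at all is treated like one that is not set, which matches the case where its parent option is disabled.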
So that works now:
root@SOQuartz:~# ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
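The two ip rule outputs in this thread (the failure and the working listing) can be told apart programmatically, for example in a smoke test of a new kernel build. The sketch below only works on captured text and assumes the simple "priority: selector lookup table" layout shown above; the function names are hypothetical.

```python
def parse_ip_rules(output):
    """Parse `ip rule` text into (priority, selector, table) tuples."""
    rules = []
    for line in output.splitlines():
        priority, sep, rest = line.partition(":")
        if not sep or not priority.strip().isdigit():
            continue  # skip error messages and anything that is not a rule
        selector, found, table = rest.strip().rpartition(" lookup ")
        if not found:
            # Rule without a "lookup" action: keep the text, no table name.
            selector, table = rest.strip(), None
        rules.append((int(priority.strip()), selector, table))
    return rules


def policy_routing_available(output):
    """True if the output lists at least one rule and no netlink error."""
    if "Operation not supported" in output:
        return False
    return bool(parse_ip_rules(output))


# Outputs as quoted in this thread.
BROKEN = "RTNETLINK answers: Operation not supported\nDump terminated\n"
WORKING = (
    "0: from all lookup local\n"
    "32766: from all lookup main\n"
    "32767: from all lookup default\n"
)

if __name__ == "__main__":
    print(policy_routing_available(BROKEN))
    print(policy_routing_available(WORKING))
    print(parse_ip_rules(WORKING))
```

On a live system the text would come from running ip rule, for example via subprocess with stderr captured as well, since the error message is not necessarily written to stdout.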
@disconn3ct can you test again with the new kernel build (same directory)? I did not rebuild the images, but can do so if it makes things easier for you.
Add more missing configs for Calico. With this config it joins the mesh successfully and seems to be working normally.