projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

Calico panics if kube-proxy using nftables mode #8025

Closed blampe closed 2 months ago

blampe commented 1 year ago

Expected Behavior

One of my nodes is emitting a warning about incompatible nft rules and then panic-looping while failing to log something.

Current Behavior

2023-09-17 16:22:39.290 [WARNING][12833] felix/table.go 765: iptables-nft-save command failed error=iptables-save failed because there are incompatible nft rules in the table ipVersion=0x4 stderr="" table="filter"
2023-09-17 16:22:39.290 [PANIC][12833] felix/table.go 771: iptables-nft-save command failed after retries ipVersion=0x4 table="filter"
panic: (*logrus.Entry) 0x400035db90

goroutine 152 [running]:
github.com/sirupsen/logrus.(*Entry).log(0x40001424d0, 0x0, {0x4000b84b70, 0x2e})
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:260 +0x4c0
github.com/sirupsen/logrus.(*Entry).Log(0x40001424d0, 0x0, {0x4000a669e8?, 0x1?, 0x1?})
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:304 +0x60
github.com/sirupsen/logrus.(*Entry).Logf(0x40001424d0, 0x0, {0x2bced89?, 0x6?}, {0x4000a66ab8?, 0x3?, 0x0?})
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:349 +0x88
github.com/sirupsen/logrus.(*Entry).Panicf(...)
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:387
github.com/projectcalico/calico/felix/iptables.(*Table).getHashesAndRulesFromDataplane(0x40002206c0)
    /go/src/github.com/projectcalico/calico/felix/iptables/table.go:771 +0x2f4
github.com/projectcalico/calico/felix/iptables.(*Table).loadDataplaneState(0x40002206c0)
    /go/src/github.com/projectcalico/calico/felix/iptables/table.go:608 +0x158
github.com/projectcalico/calico/felix/iptables.(*Table).Apply(0x40002206c0)
    /go/src/github.com/projectcalico/calico/felix/iptables/table.go:992 +0x290
github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply.func4(0x7f5b6825c0?)
    /go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:2120 +0x48
created by github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply
    /go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:2119 +0xf18

Possible Solution

The error doesn't suggest how to remove the incompatible rules. I've tried nft flush ruleset but the problem consistently comes back.

Steps to Reproduce (for bugs)


Context

Your Environment

lwr20 commented 1 year ago

One of my nodes is emitting a warning about incompatible nft rules and then panic-looping while failing to log something.

Not quite. Felix is trying to get the iptables rules from your system using iptables-nft-save, but that command is failing. After retrying, Felix gives up and falls on its sword in an attempt to recover. See https://github.com/projectcalico/calico/blob/master/felix/iptables/table.go#L750.

So the question is, what are the "incompatible entries" in the filter table that iptables-nft-save doesn't like? Has felix chosen to use nft-tables on your system incorrectly? And what is creating the incompatible entries?

Can you get the dump of iptables rules from that system and add them here please?
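One way to gather the requested dump is sketched below. It is only a sketch: it assumes the script runs as root on the affected node and that `nft`, `iptables-nft-save` and `iptables-legacy-save` may or may not be installed. The function name `collect_rule_dumps` and the output file names are illustrative, not part of Calico.

```shell
# Hypothetical helper: dump the rule sets seen by each tool into a directory.
# Tools that are not installed are skipped; failures are recorded, not fatal.
collect_rule_dumps() {
    out_dir="${1:-./rule-dumps}"
    mkdir -p "$out_dir" || return 1

    # Native nftables view, including rules that iptables-nft cannot express.
    if command -v nft >/dev/null 2>&1; then
        nft list ruleset > "$out_dir/nft-ruleset.txt" 2> "$out_dir/nft-ruleset.err" || true
    fi

    # The iptables views of the filter table; stderr is kept for the report.
    for tool in iptables-nft-save iptables-legacy-save; do
        if command -v "$tool" >/dev/null 2>&1; then
            "$tool" -t filter > "$out_dir/$tool.txt" 2> "$out_dir/$tool.err" || true
        fi
    done

    # Print the directory so the caller knows what to attach.
    echo "$out_dir"
}

# Usage: collect_rule_dumps /tmp/calico-8025
```

Comparing `nft-ruleset.txt` with the `iptables-nft-save` output (or its error file) should show which entries the iptables view cannot represent.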

tomastigera commented 1 year ago

What is the version of iptables-nft-save on your system? Have you installed any rules manually?

bchappat commented 1 year ago

We are also facing the same issue in the environment below.

Operating System : Debian Bookworm 12.1
Kernel version : 6.1.0-12-amd64
Canal Version : v3.24.5
Canal Iptables version : v1.8.6 (nf_tables)
RKE2 Version : v1.26.0+rke2r2
Debian Bookworm Host Iptables Version : 1.8.9

Please let us know if there is any resolution.

One observation we have made is that if we flush the iptables rules, we do not see this issue.

The nft list ruleset command output looks like the below.

nft-list-ruleset-bookworm.txt

tomastigera commented 1 year ago

Calico uses iptables 1.8.4 and it may lead to incompatibility with the other versions of iptables in the system. It needs some investigation. Could you build a calico-node image with 1.8.9 and test it out perhaps?

https://github.com/projectcalico/calico/blob/release-v3.24/node/Dockerfile.amd64#L16C18-L16C26
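To make the version comparison concrete, here is a small sketch that parses the `iptables -V` style strings quoted in this thread (for example `iptables v1.8.8 (nf_tables)`) and checks whether two of them match. The helper names are hypothetical, and how you obtain the version string from inside the calico-node image is left to you.

```shell
# Hypothetical helper: split `iptables -V` style output into version and backend.
# Prints "<version> <backend>", or nothing if the input does not match.
parse_iptables_version() {
    printf '%s\n' "$1" |
        sed -n 's/^iptables v\([0-9][0-9.]*\) (\([a-z_]*\)).*$/\1 \2/p'
}

# Hypothetical helper: succeed only when version and backend are identical.
same_iptables_version() {
    host=$(parse_iptables_version "$1")
    node=$(parse_iptables_version "$2")
    [ -n "$host" ] && [ "$host" = "$node" ]
}

# Usage (second argument is whatever version the calico-node image reports):
#   same_iptables_version "$(iptables -V)" "iptables v1.8.4 (nf_tables)" \
#       || echo "host and calico-node iptables differ"
```

A mismatch does not prove incompatibility by itself; it only flags the situation described in the comment above as worth investigating.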

Tallitsch commented 1 year ago

We have a k8s cluster running on RHEL 9.2, using nftables, and canal (image calico 3.26.1/flannel 0.21.4). The canal daemon-set attempts to have 2 ready for each worker node and you will see that it can only have 1 of 2 running. As an FYI, iptables is deprecated in RHEL 9, and Canal and firewalld don't play well together.

The following in my ruleset for nft is the issue regardless of syntax. Runs great without it.

add rule ip filter INPUT ct state vmap { established : accept, related : accept, invalid : drop }

or

add rule ip filter INPUT ct state vmap { established | related : accept, invalid : drop }

The rule causes the exact behavior described above.

Best regards

Tallitsch commented 1 year ago

I will have to update this. Over the weekend, without the rule, 3 nodes turned "Not Ready", so this doesn't appear to be caused by a specific rule. It also must be random, as there are 5 other nodes working just fine.

wszgrcy commented 1 year ago

My environment:

Ubuntu 22.04 LTS (GNU/Linux 5.15.0-56-generic x86_64)
v1.28.3+k3s1 in docker (host network)
one node

Strangely, it seems that the network can still be accessed. Sometimes it displays success, and then after a few seconds it returns to 0/1.

2023-11-02 02:21:45.846 [ERROR][118812] felix/table.go 857: iptables-save failed because there are incompatible nft rules in the table.  Remove the nft rules to continue. ipVersion=0x4 table="filter"
2023-11-02 02:21:45.846 [WARNING][118812] felix/table.go 806: Killing iptables-nft-save process after a failure error=iptables-save failed because there are incompatible nft rules in the table
2023-11-02 02:21:45.847 [WARNING][118812] felix/table.go 765: iptables-nft-save command failed error=iptables-save failed because there are incompatible nft rules in the table ipVersion=0x4 stderr="" table="filter"
2023-11-02 02:21:45.847 [PANIC][118812] felix/table.go 771: iptables-nft-save command failed after retries ipVersion=0x4 table="filter"
panic: (*logrus.Entry) 0xc0001dc310

goroutine 195 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc00007b730, 0x0, {0xc00074c6c0, 0x2e})
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:260 +0x491
github.com/sirupsen/logrus.(*Entry).Log(0xc00007b730, 0x0, {0xc00098aa08?, 0x1?, 0x1?})
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:304 +0x48
github.com/sirupsen/logrus.(*Entry).Logf(0xc00007b730, 0x0, {0x2f2210f?, 0x6?}, {0xc00098aad0?, 0xc0005b4a80?, 0xc0001250d0?})
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:349 +0x7c
github.com/sirupsen/logrus.(*Entry).Panicf(...)
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:387
github.com/projectcalico/calico/felix/iptables.(*Table).getHashesAndRulesFromDataplane(0xc0007c7200)
    /go/src/github.com/projectcalico/calico/felix/iptables/table.go:771 +0x3db
github.com/projectcalico/calico/felix/iptables.(*Table).loadDataplaneState(0xc0007c7200)
    /go/src/github.com/projectcalico/calico/felix/iptables/table.go:608 +0x192
github.com/projectcalico/calico/felix/iptables.(*Table).Apply(0xc0007c7200)
    /go/src/github.com/projectcalico/calico/felix/iptables/table.go:992 +0x392
github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply.func4(0xc0004f8960?)
    /go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:2120 +0x4c
created by github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply in goroutine 105
    /go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:2119 +0x12e6
W1102 02:21:45.914422  118901 feature_gate.go:241] Setting GA feature gate ServiceInternalTrafficPolicy=true. It will be removed in a future release.
# in calico-node pod
[root@VM-24-12-ubuntu /]# calico-node  -felix-ready
W1102 02:27:14.887857  140384 feature_gate.go:241] Setting GA feature gate ServiceInternalTrafficPolicy=true. It will be removed in a future release.
calico/node is not ready: felix is not ready: readiness probe reporting 503

Solved: I ran apt upgrade and rebooted, and now it is running. Maybe I only needed to reboot the system?

zzvara commented 10 months ago

We are having the same issue after a cluster upgrade.

Operating system

NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=3760.1.1
VERSION_ID=3760.1.1
BUILD_ID=2023-12-11-2212
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 3760.1.1 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar.org/"
BUG_REPORT_URL="https://issues.flatcar.org"
FLATCAR_BOARD="amd64-usr"
CPE_NAME="cpe:2.3:o:flatcar-linux:flatcar_linux:3760.1.1:*:*:*:*:*:*:*"
iptables -V
iptables v1.8.8 (nf_tables)

iptables rules on each node

cat /var/lib/iptables/rules-save
*filter

-F INPUT
-P INPUT DROP

-A INPUT  -i lo -j ACCEPT
-A OUTPUT -o lo -j ACCEPT

-A INPUT -i br0 -m state --state ESTABLISHED,RELATED -j ACCEPT

-A INPUT -s 224.0.0.0/4 -j DROP
-A INPUT -s 240.0.0.0/5 -j DROP
-A INPUT -s 255.255.255.255 -j DROP
-A INPUT -d 0.0.0.0 -j DROP
-A INPUT -s 0.0.0.0/8 -j DROP
-A INPUT -s 169.254.0.0/16 -j DROP
-A INPUT -s 192.0.2.0/24 -j DROP
-A INPUT -s 224.0.0.0/3 -j DROP

-A INPUT -s 10.0.0.0/8 -j ACCEPT
-A INPUT -s 172.16.0.0/12 -j ACCEPT
-A INPUT -s 192.168.0.0/16 -j ACCEPT

-A INPUT -i br0 -p tcp --dport 22 -j ACCEPT
-A INPUT -i br0 -p tcp --dport 80 -j ACCEPT
-A INPUT -i br0 -p tcp --dport 443 -j ACCEPT
-A INPUT -i br0 -p tcp --dport 30001:32767 -j ACCEPT
-A INPUT -i br0 -p icmp --icmp-type 0 -j ACCEPT
-A INPUT -i br0 -p icmp --icmp-type 3 -j ACCEPT
-A INPUT -i br0 -p icmp --icmp-type 11 -j ACCEPT

-I INPUT ! -s 10.0.0.0/8  -p tcp --dport 22 -i br0 -m state --state NEW -m recent --set
-I INPUT -p tcp --dport 22 -i br0 -m state --state NEW -m recent --update --seconds 60 --hitcount 2 -j REJECT
-A FORWARD -p tcp --syn -m limit --limit 1/s -j ACCEPT
-A FORWARD -p tcp --tcp-flags SYN,ACK,FIN,RST RST -m limit --limit 1/s -j ACCEPT
-A FORWARD -p icmp --icmp-type echo-request -m limit --limit 1/s -j ACCEPT

Kubernetes

Installed with Kubespray from master branch of commit aea150e. https://github.com/kubernetes-sigs/kubespray/commit/aea150e5dc244e933c6d5e2aee35ffb7ffe614a9

This installs Calico with settings:

---
# see roles/network_plugin/calico/defaults/main.yml

# the default value of name
# @note By default, it should be "k8s-pod-network",
#       however, ours is `cni0`.
# @see `cat /etc/cni/net.d/calico.conflist.template`
# @see [https://github.com/kubernetes-sigs/kubespray/issues/8810]
calico_cni_name: cni0

## With calico it is possible to distributed routes with border routers of the datacenter.
## Warning : enabling router peering will disable calico's default behavior ('node mesh').
## The subnets of each nodes will be distributed by the datacenter router
# peer_with_router: false

# Enables Internet connectivity from containers
# nat_outgoing: true

# Enables Calico CNI "host-local" IPAM plugin
# calico_ipam_host_local: true

# add default ippool name
# calico_pool_name: "default-pool"

# add default ippool blockSize (defaults kube_network_node_prefix)
calico_pool_blocksize: 24

# add default ippool CIDR (must be inside kube_pods_subnet, defaults to kube_pods_subnet otherwise)
# calico_pool_cidr: 1.2.3.4/5

# add default ippool CIDR to CNI config
# calico_cni_pool: true

# Add default IPV6 IPPool CIDR. Must be inside kube_pods_subnet_ipv6. Defaults to kube_pods_subnet_ipv6 if not set.
# calico_pool_cidr_ipv6: fd85:ee78:d8a6:8607::1:0000/112

# Add default IPV6 IPPool CIDR to CNI config
# calico_cni_pool_ipv6: true

# Global as_num (/calico/bgp/v1/global/as_num)
# global_as_num: "64512"

# If doing peering with node-assigned asn where the globas does not match your nodes, you want this
# to be true.  All other cases, false.
# calico_no_global_as_num: false

# You can set MTU value here. If left undefined or empty, it will
# not be specified in calico CNI config, so Calico will use built-in
# defaults. The value should be a number, not a string.
# calico_mtu: 1500

# Configure the MTU to use for workload interfaces and tunnels.
# - If Wireguard is enabled, subtract 60 from your network MTU (i.e 1500-60=1440)
# - Otherwise, if VXLAN or BPF mode is enabled, subtract 50 from your network MTU (i.e. 1500-50=1450)
# - Otherwise, if IPIP is enabled, subtract 20 from your network MTU (i.e. 1500-20=1480)
# - Otherwise, if not using any encapsulation, set to your network MTU (i.e. 1500)
# calico_veth_mtu: 1440

# Advertise Cluster IPs
# calico_advertise_cluster_ips: true

# Advertise Service External IPs
# calico_advertise_service_external_ips:
# - x.x.x.x/24
# - y.y.y.y/32

# Advertise Service LoadBalancer IPs
# calico_advertise_service_loadbalancer_ips:
# - x.x.x.x/24
# - y.y.y.y/16

# Choose data store type for calico: "etcd" or "kdd" (kubernetes datastore)
# @see [https://github.com/kubernetes-sigs/kubespray/issues/8917#issuecomment-1200224234]
calico_datastore: "etcd"

# Choose Calico iptables backend: "Legacy", "Auto" or "NFT"
# calico_iptables_backend: "Auto"

# Use typha (only with kdd)
# typha_enabled: false

# Generate TLS certs for secure typha<->calico-node communication
# typha_secure: false

# Scaling typha: 1 replica per 100 nodes is adequate
# Number of typha replicas
# typha_replicas: 1

# Set max typha connections
# typha_max_connections_lower_limit: 300

# Set calico network backend: "bird", "vxlan" or "none"
# bird enable BGP routing, required for ipip and no encapsulation modes
# @note We stay here for better compatibility. This shall be upgraded later.
calico_network_backend: bird

# IP in IP and VXLAN is mutualy exclusive modes.
# set IP in IP encapsulation mode: "Always", "CrossSubnet", "Never"
# @note We stay here for better compatibility. This shall be upgraded later.
calico_ipip_mode: 'Always'

# set VXLAN encapsulation mode: "Always", "CrossSubnet", "Never"
# @note We stay here for better compatibility. This shall be upgraded later.
calico_vxlan_mode: 'Never'

# set VXLAN port and VNI
# calico_vxlan_vni: 4096
# calico_vxlan_port: 4789

# Enable eBPF mode
# calico_bpf_enabled: false

# If you want to use non default IP_AUTODETECTION_METHOD, IP6_AUTODETECTION_METHOD for calico node set this option to one of:
# * can-reach=DESTINATION
# * interface=INTERFACE-REGEX
# see https://docs.projectcalico.org/reference/node/configuration
# calico_ip_auto_method: "interface=eth.*"
# calico_ip6_auto_method: "interface=eth.*"

# Set FELIX_MTUIFACEPATTERN, Pattern used to discover the host’s interface for MTU auto-detection.
# see https://projectcalico.docs.tigera.io/reference/felix/configuration
# calico_felix_mtu_iface_pattern: "^((en|wl|ww|sl|ib)[opsx].*|(eth|wlan|wwan).*)"

# Choose the iptables insert mode for Calico: "Insert" or "Append".
# calico_felix_chaininsertmode: Insert

# If you want use the default route interface when you use multiple interface with dynamique route (iproute2)
# see https://docs.projectcalico.org/reference/node/configuration : FELIX_DEVICEROUTESOURCEADDRESS
# calico_use_default_route_src_ipaddr: false

# Enable calico traffic encryption with wireguard
# calico_wireguard_enabled: false

# Under certain situations liveness and readiness probes may need tunning
# calico_node_livenessprobe_timeout: 10
# calico_node_readinessprobe_timeout: 10

# Calico apiserver (only with kdd)
# calico_apiserver_enabled: false

Calico v3.26.4.

Panic log

2024-01-01 15:52:18.217 [ERROR][628323] felix/table.go 857: iptables-save failed because there are incompatible nft rules in the table.  Remove the nft rules to continue. ipVersion=0x4 table="filter"
2024-01-01 15:52:18.217 [WARNING][628323] felix/table.go 806: Killing iptables-nft-save process after a failure error=iptables-save failed because there are incompatible nft rules in the table
2024-01-01 15:52:18.217 [WARNING][628323] felix/table.go 765: iptables-nft-save command failed error=iptables-save failed because there are incompatible nft rules in the table ipVersion=0x4 stderr="" table="filter"
2024-01-01 15:52:18.321 [ERROR][628323] felix/table.go 857: iptables-save failed because there are incompatible nft rules in the table.  Remove the nft rules to continue. ipVersion=0x4 table="filter"
2024-01-01 15:52:18.321 [WARNING][628323] felix/table.go 806: Killing iptables-nft-save process after a failure error=iptables-save failed because there are incompatible nft rules in the table
2024-01-01 15:52:18.322 [WARNING][628323] felix/table.go 765: iptables-nft-save command failed error=iptables-save failed because there are incompatible nft rules in the table ipVersion=0x4 stderr="" table="filter"
2024-01-01 15:52:18.525 [ERROR][628323] felix/table.go 857: iptables-save failed because there are incompatible nft rules in the table.  Remove the nft rules to continue. ipVersion=0x4 table="filter"
2024-01-01 15:52:18.525 [WARNING][628323] felix/table.go 806: Killing iptables-nft-save process after a failure error=iptables-save failed because there are incompatible nft rules in the table
2024-01-01 15:52:18.525 [WARNING][628323] felix/table.go 765: iptables-nft-save command failed error=iptables-save failed because there are incompatible nft rules in the table ipVersion=0x4 stderr="" table="filter"
2024-01-01 15:52:18.928 [ERROR][628323] felix/table.go 857: iptables-save failed because there are incompatible nft rules in the table.  Remove the nft rules to continue. ipVersion=0x4 table="filter"
2024-01-01 15:52:18.929 [WARNING][628323] felix/table.go 806: Killing iptables-nft-save process after a failure error=iptables-save failed because there are incompatible nft rules in the table
2024-01-01 15:52:18.929 [WARNING][628323] felix/table.go 765: iptables-nft-save command failed error=iptables-save failed because there are incompatible nft rules in the table ipVersion=0x4 stderr="" table="filter"
2024-01-01 15:52:18.929 [PANIC][628323] felix/table.go 771: iptables-nft-save command failed after retries ipVersion=0x4 table="filter"
panic: (*logrus.Entry) 0xc00046ae70

goroutine 314 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc00036fe30, 0x0, {0xc00113ac60, 0x2e})
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/entry.go:260 +0x491

However, the cluster itself seems to be operational, but we are worried.

zzvara commented 10 months ago

We are having the same issue after a cluster upgrade.

...

However, the cluster itself seems to be operational, but we are worried.

I tried the following since I posted the issue.

The nodes now look stable.


When the issue persisted, I observed that:

I updated the k8s-net-calico.yml in Kubespray inventory variables so that future upgrades to the cluster will reflect the changes:

# Choose Calico iptables backend: "Legacy", "Auto" or "NFT"
# This may be set back to `Auto` once the underlying issue is fixed/found.
# @see [https://github.com/projectcalico/calico/issues/8025]
calico_iptables_backend: "Legacy"

jorhett commented 9 months ago

I've lost several days iterating through this problem, and it's unsolvable without a complete rewrite of the NFT support in Calico.

The underlying problem here is that instead of adding real nftables support, it was added by using the iptables emulation layer. If any other subsystem on the node makes use of nft features incompatible with iptables, calico-node breaks entirely and ceases to work.

iptables-nft-save command failed error=iptables-save failed because there are incompatible nft rules in the table ipVersion=0x4 stderr="" table="filter"

At this time the current versions of all of the following make nft-specific changes to the rules which will cause calico to break:

  • docker
  • CRI
  • kubernetes (specifically kube-proxy)

There is no solution, so people are being forced to switch away from Calico to restore their kubernetes cluster networking

zzvara commented 9 months ago

I've lost several days iterating through this problem, and it's unsolvable without a complete rewrite of the NFT support in Calico.

The underlying problem here is that instead of adding real nftables support, it was added by using the iptables emulation layer. If any other subsystem on the node makes use of nft features incompatible with iptables, calico-node breaks entirely and ceases to work.

iptables-nft-save command failed error=iptables-save failed because there are incompatible nft rules in the table ipVersion=0x4 stderr="" table="filter"

At this time the current versions of all of the following make nft-specific changes to the rules which will cause calico to break:

  • docker
  • CRI
  • kubernetes (specifically kube-proxy)

There is no solution, so people are being forced to switch away from Calico to restore their kubernetes cluster networking

Set FELIX_IPTABLESBACKEND from Auto to Legacy and you are fixed.
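For readers who want to try this suggestion, a minimal sketch follows. It assumes a manifest-based install where the `calico-node` DaemonSet lives in `kube-system`; operator-managed installs may need a different mechanism. The function name is hypothetical, and the accepted values are the ones named in the Kubespray comment earlier in this thread. Note that later comments in this thread argue that Legacy is not a real solution.

```shell
# Hypothetical helper: set Felix's iptables backend on the calico-node DaemonSet.
# Assumes calico-node runs in kube-system unless a namespace is given.
set_felix_iptables_backend() {
    backend="$1"
    namespace="${2:-kube-system}"

    # Accept only the values mentioned in this thread.
    case "$backend" in
        Legacy|Auto|NFT) ;;
        *) echo "unknown backend: $backend" >&2; return 1 ;;
    esac

    # Updating the env var triggers a rolling restart of the calico-node pods.
    kubectl -n "$namespace" set env daemonset/calico-node \
        "FELIX_IPTABLESBACKEND=$backend"
}

# Usage: set_felix_iptables_backend Legacy
```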

bmckercher123 commented 9 months ago

Apologies to all users on this thread that our documentation failed to provide the solution using FELIX_IPTABLESBACKEND. Doc/Ops team is looking at the best places (probably several) to ensure no one has to struggle with this again.

jorhett commented 9 months ago

Set FELIX_IPTABLESBACKEND from Auto to Legacy and you are fixed.

That would be a very odd definition of "fixed" -- you must be referring to usage which means castrated? 😉

If my kernel is using nftables then even if I could run iptables and nftables side by side, why in the world would I want to have that confusion? And most modern distro releases don't even have the legacy option available any more.

The move to nftables is approaching a decade old. Calico needs to update away from classic iptables before there's no kernels left that support it.

Apologies to all users on this thread that our documentation failed to provide the solution using FELIX_IPTABLESBACKEND.

The documentation makes clear how AUTO and NFT options work. This isn't the problem. The problem is that your NFT support is still using iptables commands. It's not really NFT support, it's a passthrough to an emulator that tries to present the nftables in iptables output. Which fails with even simple native nft tables.

When set to NFT, you should be using nft commands, not iptables commands.

bmckercher123 commented 9 months ago

You raise good points that are being reviewed. I agree "fixed" was not the best choice of words here. As a writer, I can only help avoid churn, frustration, and time lost troubleshooting for other users until a proper solution is in place.

jorhett commented 9 months ago

Oh, I took no offense to your use of the word. As a writer myself, I tried to play with the word to make it clear I was laughing so that I didn't come off too intensely critical.

Yes, the situation is complex, especially when supporting multiple generations of kernels in heterogeneous environments, and I know it's been tricky for projects to find the right balance of embracing nft while continuing to support iptables. I'm just trying to push the point that making the investment in pure nftables support is necessary at this point, now that other projects have made that investment and the tables are no longer backwards compatible with iptables.

zzvara commented 9 months ago

I apologize for my confusion. Could some of you elaborate on some of the deep technical points raised here?

That would be a very odd definition of "fixed" -- you must be referring to usage which means castrated? 😉

If my kernel is using nftables then even if I could run iptables and nftables side by side, why in the world would I want to have that confusion?

And most modern distro releases don't even have the legacy option available any more.

This would significantly improve my understanding so I can be on the same level as some of you. I appreciate any help you can provide.

tomastigera commented 9 months ago

I apologize for my confusion. Could some of you elaborate on some of the deep technical points raised here?

That would be a very odd definition of "fixed" -- you must be referring to usage which means castrated? 😉

  • How is setting FELIX_IPTABLESBACKEND to Legacy castration? Did I miss something here?

If my kernel is using nftables then even if I could run iptables and nftables side by side, why in the world would I want to have that confusion?

  • How do you define confusion here? Is this a metaphor? Could you give examples?

Under the hood you would run one, but yes, it may lead to some incompatibilities (perhaps referred to here as confusion).

And most modern distro releases don't even have the legacy option available any more.

  • What does "modern" mean here?

This would significantly improve my understanding so I can be on the same level as some of you. I appreciate any help you can provide.

Modern means that newer versions do not come with compatibility packages between iptables and nftables.

As @bmckercher123 said, we are looking into this issue and we will address it one way or another. Thanks for reporting the issue and apologies for the current troubles.

fasaxc commented 9 months ago

Sorry @tomastigera @zzvara, setting the mode to legacy is not a solution to this problem. The current best answer is to use iptables-nft for all your components until we get a proper nftables backend in place. Using a mix of legacy iptables and nftables doesn't fail (assuming your kernel supports both) but the behaviour is very counter-intuitive. nftables can "undo" the verdict made by iptables-legacy so your policy may not get properly enforced and the failures will be confusing.
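The mixed legacy/nft situation described above can be spotted with a check along these lines. This is a sketch under assumptions: both `iptables-legacy-save` and `iptables-nft-save` are installed on the node, the script runs as root, and counting `-A` lines is a good enough proxy for "this backend holds rules". If `iptables-nft-save` itself fails, as in the logs in this thread, its count will be unreliable. All function names are illustrative.

```shell
# Count "-A" (append rule) lines in iptables-save style output read from stdin.
count_rules() {
    grep -c '^-A ' || true
}

# Turn two rule counts (legacy, nft) into a one-line summary.
classify_backends() {
    if [ "$1" -gt 0 ] && [ "$2" -gt 0 ]; then
        echo "mixed: legacy=$1 nft=$2"
    elif [ "$1" -gt 0 ]; then
        echo "legacy only: $1 rules"
    elif [ "$2" -gt 0 ]; then
        echo "nft only: $2 rules"
    else
        echo "no rules found"
    fi
}

# Hypothetical check: report which iptables backends currently hold rules.
report_backends() {
    legacy=$(iptables-legacy-save 2>/dev/null | count_rules)
    nft=$(iptables-nft-save 2>/dev/null | count_rules)
    classify_backends "$legacy" "$nft"
}

# Usage: report_backends
```

A "mixed" result is the case where, per the comment above, one backend can undo the verdict of the other.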

I understand the desire to jump to "proper" nftables mode ASAP but please bear in mind that kubernetes nftables mode is in Alpha in v1.29. It's not ready for prod use either.

We've been relying on the iptables-nft translation layer for a long time, which has meant that we're in sync with kube-proxy. If we moved to native nftables before kube-proxy then we'd have caused the same problem for kube-proxy!

Clearly, now that kube-proxy has nftables support, we also need to add it ASAP in order to remain in sync. I for one didn't spot that nftables support was on the slate for v1.29.

jorhett commented 9 months ago

The current best answer is to use iptables-nft

Which does not work, as the issue reports here and as I and others have reported. iptables-nft fails when anything that iptables cannot express is in the nft tables, and every other project involved in kubernetes is now adding rules that are incompatible.

Using a mix of legacy iptables and nftables doesn't fail [...] so your policy may not get properly enforced and the failures will be confusing.

Sounds like the definition of failure to me. It works only in very limited circumstances and debugging it is as confusing as hell.

I understand the desire to jump to "proper" nftables mode ASAP but please bear in mind that kubernetes nftables mode is in Alpha in v1.29. It's not ready for prod use either.

  1. Modern distros are dropping iptables legacy support
  2. All other packages are now taking advantage of nft's features, rendering nft tables incompatible with the emulation

We've been relying on the iptables-nft translation layer for a long time, which has meant that we're in sync with kube-proxy. If we moved to native nftables before kube-proxy then we'd have caused the same problem for kube-proxy!

kube-proxy 1.23+ is what is creating nft tables that iptables-nft can't parse, and causing calico to fail.

Clearly, now that kube-proxy has nftables support, we also need to add it ASAP in order to remain in sync.

Yes, this is the core problem.

fasaxc commented 9 months ago

Which does not work, as the issue reports here and as I and others have reported. iptables-nft fails when anything that iptables cannot express is in the nft tables, and every other project involved in kubernetes is now adding rules that are incompatible.

Yes, there are two issues here:

kube-proxy 1.23+ is what is creating nft tables that iptables-nft can't parse, and causing calico to fail.

Hopefully this just means bumping our iptables version to pick up the latest version of the compatibility shim, so we can get a fix for that out soon.

Unfortunately, that fix won't make kube-proxy nftables mode work.

caseydavenport commented 8 months ago

We need to bump iptables version because the iptables-nft shim that kube-proxy etc is using has been updated.

https://github.com/projectcalico/calico/pull/8416 updates the version of the compatibility layer we include in Calico, and so should solve this first bullet point and make Calico compatible with kube-proxy when both are running in iptables-nft compatibility mode.

As @fasaxc suggested above, in order to support compatibility with other users of nftables we will likely need to stop depending on the iptables-nft compatibility layer. I'll be looking into this.

luniHw commented 6 months ago

Hi team, is there any workaround for this issue?

fasaxc commented 5 months ago

@luniHw Yes, the workaround is to not use kube-proxy in nftables mode with Calico!
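To check whether a cluster is affected by this, one option is to read the `mode` field from the kube-proxy configuration. The sketch below assumes a kubeadm-style layout, where the configuration is stored under the `config.conf` key of the `kube-proxy` ConfigMap in `kube-system`; other distributions store it elsewhere. The helper name is hypothetical, and treating an empty `mode` as `iptables` reflects the Linux default at the time of this thread.

```shell
# Hypothetical helper: read a KubeProxyConfiguration from stdin, print its mode.
# An empty or missing `mode` is reported as "iptables".
kube_proxy_mode() {
    mode=$(sed -n 's/^[[:space:]]*mode:[[:space:]]*"\{0,1\}\([A-Za-z]*\)"\{0,1\}[[:space:]]*$/\1/p' |
        head -n 1)
    echo "${mode:-iptables}"
}

# Usage (kubeadm-style clusters):
#   kubectl -n kube-system get configmap kube-proxy \
#       -o jsonpath='{.data.config\.conf}' | kube_proxy_mode
```

If this prints `nftables`, the workaround above applies: switch kube-proxy back to another mode until a Calico release with a native nftables dataplane is in use.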

caseydavenport commented 4 months ago

Quick update - just merged a Calico nftables dataplane implementation compatible with nftables kube-proxy here: https://github.com/projectcalico/calico/pull/8780

tech-preview support is currently scheduled for Calico v3.29.0, and a GA release will come sometime after that once we determine it to be sufficiently stable.

caseydavenport commented 2 months ago

I'm going to close this for now. With the nftables dataplane mentioned in my previous comment arriving in Calico v3.29, you should be able to run Calico with the (now beta) nftables kube-proxy mode.

Any further compatibility issues between the two should be handled in distinct issues. Thanks all.