projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

ebpf mode can break host networking when calico-node rotates #6739

Closed: calerogers closed this issue 2 weeks ago

calerogers commented 2 years ago

When eBPF mode is enabled and a calico-node pod is terminated, host-level networking breaks. This appears to be due to the BPF prog_array map IDs for Calico being zeroed out while the BPF ingress and egress filter rules remain attached, causing a DENY action on connections.
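For context, the broken state is visible from the host itself. A minimal check, assuming ens5 as the interface name (bpftool may need to be installed separately, and the grep pattern assumes Calico's cali_ map naming):

tc filter show dev ens5 ingress    # Calico's filter rules are still attached here
tc filter show dev ens5 egress
bpftool map list | grep cali       # the prog_array maps those programs tail-call into

The filters keep running against the emptied prog_array maps, which is what produces the DENY described above.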

Expected Behavior

I rotate / delete a calico-node pod and a replacement pod should come up.

Current Behavior

I rotate / delete a calico-node pod and it appears stuck in Terminating via kubectl. The host becomes unreachable over SSH and other connectivity is broken.

Possible Solution

This is not a proper solution for what is happening, but to remedy the broken state you can connect via serial console and run (with the appropriate interface name):

tc filter del dev ens5 ingress
tc filter del dev ens5 egress

and / or delete the entire qdisc created by Calico:

tc qdisc del dev ens5 clsact
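To confirm the cleanup took effect (same interface-name caveat), the filter list should come back empty and the clsact qdisc should no longer be listed:

tc filter show dev ens5 ingress
tc qdisc show dev ens5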

Steps to Reproduce (for bugs)

  1. Create cluster in non-ebpf mode
  2. Switch to ebpf mode via https://projectcalico.docs.tigera.io/maintenance/ebpf/enabling-ebpf
  3. Rotate a calico-node pod (steps 2 and 3 are sketched below)
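For an operator-managed install, steps 2 and 3 come down to roughly the following (a sketch; see the linked doc for the authoritative steps, the calico-system namespace assumes a tigera-operator deployment, and NODE_NAME is a placeholder):

kubectl patch installation.operator.tigera.io default --type merge -p '{"spec":{"calicoNetwork":{"linuxDataplane":"BPF"}}}'
kubectl delete pod -n calico-system -l k8s-app=calico-node --field-selector spec.nodeName=NODE_NAME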

Context

Here is a long thread with debug output. Note that this thread did not start out troubleshooting this exact situation; it switched to this particular behavior at this message: https://calicousers.slack.com/archives/CUKP5S64R/p1663607646628119?thread_ts=1663261785.681519&cid=CUKP5S64R

Your Environment

calerogers commented 2 years ago

It seems that mounting /sys/fs/bpf into the calico-node pod will also resolve this behavior!
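For anyone wanting to verify that, /sys/fs/bpf should appear as a bpf filesystem inside a running calico-node pod (a sketch; POD_NAME is a placeholder, and calico-system assumes an operator-based install):

kubectl exec -n calico-system POD_NAME -c calico-node -- grep /sys/fs/bpf /proc/mounts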

log1cb0mb commented 1 year ago

I am running into the same issue. I should mention that it's not just about rotating calico-node pods or upgrading from <3.24.x; this seems to happen with a fresh cluster and Calico install as well. The workaround of deleting the qdisc worked for me too.

log1cb0mb commented 1 year ago

Details from my setup and what I tried before finding this issue:

After a painstaking process and a dozen attempts at installs/reinstalls and config changes to the Calico deployment (tigera-operator based), whether upgrading or doing a straight fresh install of Calico 3.24.x, networking ends up broken on bare-metal nodes. It boils down to this particular behaviour:

https://docs.tigera.io/calico/3.25/operations/ebpf/troubleshoot-ebpf#check-if-a-program-is-dropping-packets

In the same cluster, which includes RHEL-based VMs, all is good; however, on bare-metal RHEL nodes, as soon as calico-node comes up (at the stage where bird is reported Ready), the host loses all network connectivity for new connections, while existing connections (e.g. SSH sessions to the nodes) stay intact.

Initial investigation showed all inbound packets to the node simply vanishing, until I discovered they were being dropped on the physical interfaces. It's not just inbound connections from external hosts but also connections that the node itself generates: outbound packets get through, but the return traffic is dropped.

Now the article says “check if a program is dropping packets” but does not say what to do if it does. I cannot find any other logs or errors that would indicate the issue. I even stripped the configuration down so it does not auto-create HEPs and policies for hosts. The only way to restore connectivity is to reboot the node and essentially not let calico-node get scheduled or run on the node again. I even tried wiping iptables etc.; that does not restore connectivity until the host is rebooted.

tc -s qdisc show | grep clsact -A 2
qdisc clsact ffff: dev enp193s0f0 parent ffff:fff1
 Sent 150143 bytes 1313 pkt (dropped 298, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
--
qdisc clsact ffff: dev enp194s0f0 parent ffff:fff1
 Sent 251140 bytes 1416 pkt (dropped 335, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

Drop counter keeps increasing…
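If it helps anyone else debugging this, the program IDs behind those drops can be matched up by listing the tc-attached BPF programs (a sketch; bpftool must be available on the node, and enp193s0f0 is just my interface from the output above):

bpftool net show                        # lists tc/xdp-attached BPF programs per interface
tc filter show dev enp193s0f0 ingress   # shows the filter rules and their BPF program IDs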

mazdakn commented 1 year ago

@calerogers we mount /sys/fs/bpf from the host into calico-node, but that needs mount propagation to actually stick between different calico-node pods. Can you check that it is enabled in your setup?
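A quick way to check (a sketch; the calico-node daemonset name and calico-system namespace assume an operator-based install, adjust for manifest installs):

kubectl get ds calico-node -n calico-system -o yaml | grep -B3 mountPropagation

The volume mount for /sys/fs/bpf should carry a mountPropagation setting (e.g. Bidirectional) so the bpffs mount is shared between the host and successive calico-node pods.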