k8snetworkplumbingwg / multus-cni

A CNI meta-plugin for multi-homed pods in Kubernetes
Apache License 2.0

Macvlan not working with calico VXLAN #1246

Closed EugenMayer closed 2 days ago

EugenMayer commented 3 months ago

Setup

My k8s host (k3s) is running Calico (with network namespace support) as the base CNI. The host has 2 interfaces.

On the k8s host:

ip a show eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc fq_codel state UP group default qlen 1000
    link/ether bc:24:11:c1:4b:1d brd ff:ff:ff:ff:ff:ff
    altname enp0s19
    inet 10.10.12.10

I use the rke2-multus chart to install Multus and add the following NetworkAttachmentDefinition:

{
  "name": "iot-macvlan",
  "cniVersion": "0.3.1",
  "type": "macvlan",
  "master": "eth1",
  "mode": "bridge",
  "ipam": {
    "type": "host-local",
    "subnet": "10.10.12.0/24",
    "rangeStart": "10.10.12.200",
    "rangeEnd": "10.10.12.250",
    "gateway": "10.10.12.1"
  }
}

My test-pod gets the (expected) annotation k8s.v1.cni.cncf.io/networks: multus/iot-macvlan@iot
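For reference, a minimal sketch of how such a config is typically wrapped into a NetworkAttachmentDefinition manifest (name and namespace here mirror the annotation above; the metadata the rke2-multus chart actually generates may differ):

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: iot-macvlan
  namespace: multus
spec:
  config: |
    {
      "name": "iot-macvlan",
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth1",
      "mode": "bridge",
      "ipam": {
        "type": "host-local",
        "subnet": "10.10.12.0/24",
        "rangeStart": "10.10.12.200",
        "rangeEnd": "10.10.12.250",
        "gateway": "10.10.12.1"
      }
    }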

Analysis

Shelling into the pod, everything seems to be set up just right:

root@network-tools-6f49496bcd-mfzrd:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0@if57: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1390 qdisc noqueue state UP group default qlen 1000
    link/ether 82:d7:5e:9c:b2:98 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.12.5.171/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::80d7:5eff:fe9c:b298/64 scope link 
       valid_lft forever preferred_lft forever
3: iot@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default 
    link/ether 96:7d:51:f8:2b:72 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.10.12.200/24 brd 10.10.12.255 scope global iot
       valid_lft forever preferred_lft forever
    inet6 fe80::947d:51ff:fef8:2b72/64 scope link 
       valid_lft forever preferred_lft forever
root@network-tools-6f49496bcd-mfzrd:/# ip route
default via 169.254.1.1 dev eth0 
10.10.12.0/24 dev iot proto kernel scope link src 10.10.12.200 
169.254.1.1 dev eth0 scope link 

Trying to ping the host interface or any other device in the 10.10.12.0/24 network does not work. Using tcpdump on the k8s host, I do not even see any ICMP packets flowing anywhere.

Pinging the pod interface from the k8s host does not work either; running tcpdump in the pod does not reveal anything.

What is working

What is not working

No access to the network attached via the multus interface (10.10.12.0/24)

Assumptions

Thank you for any hints / help


EugenMayer commented 3 months ago

I usually only use network policies for incoming traffic into a namespace, so

resource "kubernetes_network_policy" "policy" {
  metadata {
    name      = "default-deny-ingress"
    namespace = var.namespace
  }

  spec {
    policy_types = ["Ingress"]
    pod_selector {}
    ingress {
      from {
        namespace_selector {
          match_labels = {
            "kubernetes.io/metadata.name" = var.namespace
          }
        }
      }
    }
  }
}

but I do not control egress at all, so I would assume it is not affected. I also disabled all ingress/egress policies for the test pod's namespace in a second test - nothing changed. Still no network access.
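As a sanity check, a policy like the following could also be deployed to explicitly allow all egress from the namespace; this is a hypothetical sketch (name and namespace are placeholders), not something deployed here:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-egress        # hypothetical name
  namespace: my-namespace       # assumption: the test pod's namespace
spec:
  podSelector: {}               # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - {}                        # empty rule matches all destinations, i.e. allow all egress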

EugenMayer commented 3 months ago

Found #1023, which might be related. I am not sure whether it was about Calico using IPVLAN or about Multus. In my case, Calico is using VXLAN, while the Multus-attached interface is using macvlan.

Still trying to puzzle it together, any help appreciated.

EugenMayer commented 3 months ago

I tried to reduce the complexity to understand which parts are actually working, using this as my NetworkAttachmentDefinition (deployed in the multus namespace):

{
  "name": "iot-host",
  "cniVersion": "0.3.1",
  "type": "host-device",
  "device": "eth1"
}

with this annotation on the pod

annotations:
        k8s.v1.cni.cncf.io/networks: 'multus/iot-host'

lets the pod properly access the network as expected. So it might be macvlan related.

Next would be creating a bridge for eth1 and using a tap network for the pods.

EugenMayer commented 3 months ago

Tried ipvlan with:

{
  "name": "iot-ipvlan",
  "cniVersion": "0.3.1",
  "type": "ipvlan",
  "master": "eth1",
  "ipam": {
    "type": "host-local",
    "subnet": "10.10.12.0/24",
    "rangeStart": "10.10.12.200",
    "rangeEnd": "10.10.12.250",
    "gateway": "10.10.12.1",
    "routes": [ { "dst": "240.0.0.0/4" } ]
  }
}

annotations:
        k8s.v1.cni.cncf.io/networks: 'multus/iot-ipvlan'

which does not work. Pods still have no access to 10.10.12.0/24.

EugenMayer commented 3 months ago

To explore whether Calico's encapsulation could be the cause, I switched from VXLAN to other encapsulation modes.

Neither of these worked with macvlan/ipvlan.

EugenMayer commented 3 months ago

Failed to test the tap-based setup due to #1247.

EugenMayer commented 3 months ago

Reading through #229, I checked the traffic on eth1:

tcpdump -vvvnnpi eth1 
tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
19:30:52.954371 IP (tos 0x0, ttl 64, id 65062, offset 0, flags [DF], proto ICMP (1), length 84)
    10.10.12.201 > 10.10.12.16: ICMP echo request, id 4, seq 0, length 64
19:30:53.955506 IP (tos 0x0, ttl 64, id 65063, offset 0, flags [DF], proto ICMP (1), length 84)
    10.10.12.201 > 10.10.12.16: ICMP echo request, id 4, seq 1, length 64
tcpdump -vvvnnpi eth1 arp
tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
19:31:20.447538 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.12.1 tell 10.10.12.16, length 42
19:31:25.512969 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.12.16 tell 10.10.12.201, length 28
19:31:25.563050 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.10.12.16 is-at 60:74:f4:3f:d0:e6, length 42

There are new findings here, especially that I now actually see the ICMP on my host interface. Maybe this has to do with the switch to IPIP.

Since my k3s node runs on Proxmox, I started to debug with tcpdump on the physical interface there:

tcpdump -e -vvvnnpi enp1s0 'vlan and dst 10.10.12.16 and icmp'

When I ping from the k3s host (which is working), it looks like this:

20:42:26.177442 bc:24:11:c1:4b:1d > 60:74:f4:3f:d0:e6, ethertype 802.1Q (0x8100), length 102: vlan 14, p 0, ethertype IPv4 (0x0800), (tos 0x0, ttl 64, id 35682, offset 0, flags [DF], proto ICMP (1), length 84)
    10.10.12.10 > 10.10.12.16: ICMP echo request, id 21, seq 1, length 64
20:42:27.179171 bc:24:11:c1:4b:1d > 60:74:f4:3f:d0:e6, ethertype 802.1Q (0x8100), length 102: vlan 14, p 0, ethertype IPv4 (0x0800), (tos 0x0, ttl 64, id 35742, offset 0, flags [DF], proto ICMP (1), length 84)
    10.10.12.10 > 10.10.12.16: ICMP echo request, id 21, seq 2, length 64

Doing the same from the pod (which is not working), it looks like this:

20:43:01.775966 3e:04:c8:89:df:8a > 60:74:f4:3f:d0:e6, ethertype 802.1Q (0x8100), length 102: vlan 14, p 0, ethertype IPv4 (0x0800), (tos 0x0, ttl 64, id 48942, offset 0, flags [DF], proto ICMP (1), length 84)
    10.10.12.201 > 10.10.12.16: ICMP echo request, id 22, seq 0, length 64
20:43:02.777056 3e:04:c8:89:df:8a > 60:74:f4:3f:d0:e6, ethertype 802.1Q (0x8100), length 102: vlan 14, p 0, ethertype IPv4 (0x0800), (tos 0x0, ttl 64, id 49162, offset 0, flags [DF], proto ICMP (1), length 84)
    10.10.12.201 > 10.10.12.16: ICMP echo request, id 22, seq 1, length 64

In both cases the packets have the correct VLAN tag and look basically the same. Am I missing something obvious?

Trying to ping 10.10.12.1 (my gateway) from the pod, I can see that the packets actually reach my gateway:

tcpdump -i em1_vlan14 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1_vlan14, link-type EN10MB (Ethernet), capture size 262144 bytes
20:48:29.260584 IP 10.10.12.201 > 10.10.12.1: ICMP echo request, id 27, seq 68, length 64
20:48:30.260640 IP 10.10.12.201 > 10.10.12.1: ICMP echo request, id 27, seq 69, length 64

So in the end, the issue is not that the packets fail to get out; rather, the replies do not make it back to the pod.

EugenMayer commented 3 months ago

Really stuck here, I would appreciate any hints or breadcrumbs to debug further.

EugenMayer commented 3 months ago

Well, even though this issue queue seems to be a monologue for everybody, maybe somebody will run into this anyway.

After understanding that I did not properly "redeploy" Calico with IPIP, I remade the cluster with IPIP and activated BGP as required.

All the macvlan/ipvlan setups above work without an issue when doing so. So in fact, the entire issue comes down to using Calico's VXLAN encapsulation. Switching to IPIP simply fixed it.
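For anyone hitting the same thing: the encapsulation is configured on the Calico IPPool. A minimal sketch of a pool using IPIP instead of VXLAN (pool name and CIDR are assumptions; adjust them to your installation):

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool     # assumption: the default pool name
spec:
  cidr: 10.12.0.0/16            # assumption, derived from the pod IP 10.12.5.171 above
  ipipMode: Always              # IPIP encapsulation instead of VXLAN
  vxlanMode: Never
  natOutgoing: true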

dougbtv commented 3 months ago

This might be more related to macvlan + Calico and not so much to Multus, as it's an interaction with the delegated plugins.

I'll leave it open for now in case anyone has input, but it also might be worth filing against calico or potentially macvlan CNI plugins.

github-actions[bot] commented 1 week ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.