netscaler / netscaler-k8s-node-controller

Integrates the hardware/virtualized NetScaler with the Kubernetes overlay / underlay

Calico network plugin shown as undefined #45

Closed: tomasvaclavik closed this issue 1 month ago

tomasvaclavik commented 3 months ago

Hello, I've started a POC for our on-prem Kubernetes cluster deployment, using the ADC as the ingress point, but I got stuck with the Citrix Node Controller crash-looping. After spending some time debugging, I found that the problem is CNC not detecting the Calico plugin: it returns "undefined" instead and does not configure the correct routes on the ADC.

Kubernetes version: v1.30.2
Calico version: 3.28, no BGP
Image: quay.io/citrix/citrix-k8s-node-controller:2.2.12

According to issue #31, it should work with VXLAN in Always mode.
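
For reference, this is roughly how I verified the pool encapsulation and the Calico VXLAN device that CNC is supposed to pick up (a quick sketch; the pool and device names are the ones from my cluster, shown in the dumps below):

# Confirm the IP pool really uses VXLAN in Always mode (expect: Always)
kubectl get ippool default-ipv4-ippool -o jsonpath='{.spec.vxlanMode}{"\n"}'
# Confirm the Calico VXLAN device exists on the node
ip -d link show vxlan.calico

Both check out on my nodes, yet CNC still reports the CNI as undefined.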

[root@K8S ~]# kubectl get configmaps -n kube-system kube-cnc-router -o yaml
apiVersion: v1
data:
  CNI-10.131.61.152: ""
  CNI-10.131.61.154: ""
  EndpointIP: 172.18.3.254
  Host-k8loc: 10.131.61.154
  Host-supploc: 10.131.61.152
  Interface-10.131.61.152: 172.18.3.2
  Interface-10.131.61.154: 172.18.3.1
  Mac-10.131.61.152: 82:ed:b3:27:b9:42
  Mac-10.131.61.154: e6:1c:69:88:86:be
  Node-10.131.61.152: 10.131.61.152
  Node-10.131.61.154: 10.131.61.154
  Type: VXLAN
  VTEP: calicoundefined
kind: ConfigMap
metadata:
  creationTimestamp: "2024-07-23T14:06:43Z"
  name: kube-cnc-router
  namespace: kube-system
  resourceVersion: "26696950"
  uid: 0100a125-9ab5-47b5-8980-6dfb0eeb1045
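
Note the VTEP value "calicoundefined" and the empty CNI-<node IP> keys: this is the state CNC later trips over. One way to force CNC to rebuild the map is to delete it and restart the controller (a sketch only; the deployment name and namespace are assumptions, adjust them to whatever your citrix-k8s-node-controller.yaml creates):

kubectl delete configmap kube-cnc-router -n kube-system
# Deployment name and namespace below are assumed, not taken from my manifest
kubectl rollout restart deployment/citrix-node-controller -n kube-system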

Node interfaces (shortened)

calif821254bd00: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet6 fe80::ecee:eeff:feee:eeee  prefixlen 64  scopeid 0x20<link>
        ether ee:ee:ee:ee:ee:ee  txqueuelen 1000  (Ethernet)
        RX packets 10228133  bytes 3431808938 (3.1 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8285370  bytes 4570886037 (4.2 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

cncvxlan73964: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.18.3.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::ce2:63ff:fef7:61d6  prefixlen 64  scopeid 0x20<link>
        ether 0e:e2:63:f7:61:d6  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 9  bytes 544 (544.0 B)
        TX errors 0  dropped 3 overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.131.61.154  netmask 255.255.255.0  broadcast 10.131.61.255
        ether 00:15:5d:6f:6e:03  txqueuelen 1000  (Ethernet)
        RX packets 10228133  bytes 3431808938 (3.1 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8285371  bytes 4570886423 (4.2 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 121393893  bytes 22238890413 (20.7 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 121393893  bytes 22238890413 (20.7 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vxlan.calico: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1400
        inet 10.20.215.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::6488:45ff:fe83:9b1a  prefixlen 64  scopeid 0x20<link>
        ether 66:88:45:83:9b:1a  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

IP pool config

[root@K8S ~]# kubectl describe ippool default-ipv4-ippool
Name:         default-ipv4-ippool
Namespace:
Labels:       app.kubernetes.io/managed-by=tigera-operator
Annotations:  <none>
API Version:  projectcalico.org/v3
Kind:         IPPool
Metadata:
  Creation Timestamp:  2024-05-09T15:17:19Z
  Resource Version:    23929283
  UID:                 d2d68549-3e54-476a-916e-d1d4b5cfd50e
Spec:
  Allowed Uses:
    Workload
    Tunnel
  Block Size:     26
  Cidr:           10.20.0.0/16
  Ipip Mode:      Never
  Nat Outgoing:   true
  Node Selector:  all()
  Vxlan Mode:     Always
Events:           <none>

Nodes

[root@K8S ~]# kubectl get nodes -o wide
NAME               STATUS   ROLES           AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION          CONTAINER-RUNTIME
k8s.loc    Ready    control-plane   145d   v1.30.2   10.131.61.154   <none>        CentOS Stream 9                5.14.0-474.el9.x86_64   containerd://1.7.19
supp.loc   Ready    <none>          145d   v1.30.2   10.131.61.152   <none>        CentOS Stream 9                5.14.0-474.el9.x86_64   containerd://1.7.19

Controller log

I0723 13:58:45.771742       1 k8sInterface.go:144] [INFO] CONFIG MAP UPDATE EVENT: Adding new config
I0723 13:58:45.771779       1 k8sInterface.go:145] [INFO] Key Node-10.131.61.154 Value 10.131.61.154
I0723 13:58:45.771797       1 k8sInterface.go:152] [INFO] Update: Host Route Iterface IP: 172.18.3.1 Mac: 26:1c:4f:02:d3:e0 CNI-Route
E0723 13:58:45.771811       1 k8sInterface.go:157] [ERROR] Could not fetch Network, need enhancements[]
F0723 13:58:45.771832       1 k8sInterface.go:64] invalid CIDR address:
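
The fatal at k8sInterface.go:64 looks like a CIDR parser being handed an empty string (the message format matches Go's net.ParseCIDR), which lines up with the empty CNI-<node IP> keys in the configmap above. The key can be checked directly (dots in the key escaped for jsonpath):

kubectl get configmap kube-cnc-router -n kube-system \
  -o jsonpath='{.data.CNI-10\.131\.61\.154}{"\n"}'
# Prints an empty line: the CNI network for this node was never resolved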

Router log

[root@K8S ~]# kubectl logs kube-cnc-router-k8loc -n kube-system
CNI Name is calico
ip link delete cncvxlan73964
Host Interface eth0
CNI Interface undefined
ip link add cncvxlan73964 type vxlan id 175  dev eth0  dstport 8472
ip link set up dev cncvxlan73964
ip addr add 172.18.3.1/24 dev cncvxlan73964
InterfaceMac 66:9e:b2:ff:fe:be
VTEP Address 172.18.3.1
Host IP Address 10.131.61.154
Device "undefined" does not exist.
Device "undefined" does not exist.
CNI IP Address
CNI IP Prefix /26
CNI Addr
bridge fdb add 00:00:00:00:00:00 dev cncvxlan73964 dst 10.131.61.156
iptables -I INPUT 1 -p udp --dport 8472 -j ACCEPT
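
The two 'Device "undefined" does not exist.' lines suggest the router script queries an interface literally named "undefined" instead of the Calico VXLAN device. Run by hand against the real device, the equivalent lookups work (device name from the interface listing above; the exact commands the script runs are my guess):

ip -4 addr show dev vxlan.calico   # 10.20.215.0/32, this node's tunnel address
ip route show dev vxlan.calico     # the pod-CIDR routes CNC should mirror onto the ADC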

Is there a bug with the new Calico version, or is there some other problem I'm not seeing?

tomasvaclavik commented 1 month ago

Never mind. I restarted the nodes, redeployed the same citrix-k8s-node-controller.yaml, and it started working.
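
For anyone who lands here, the recovery was roughly the following (same manifest as before; the router pod name is the one from my cluster):

kubectl delete -f citrix-k8s-node-controller.yaml
# (reboot the nodes here)
kubectl apply -f citrix-k8s-node-controller.yaml
# Verify the CNI interface is now detected instead of "undefined"
kubectl logs kube-cnc-router-k8loc -n kube-system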