rancher / rke2

https://docs.rke2.io/
Apache License 2.0

Multi-Node-Cluster metric-server not reachable from other node. #5742

Closed kgts23 closed 5 months ago

kgts23 commented 5 months ago

Environmental Info:

RKE2 Version: rke2 version v1.29.3+rke2r1 (1c82f7ed292c4ac172692bb82b13d20733909804)
go version go1.21.8 X:boringcrypto

Node(s) CPU architecture, OS, and Version: x86_64, Ubuntu, 22.04.4 LTS

Cluster Configuration: 3 server nodes (intended to host Rancher later), no CIS profile.

root@k8s-node02:~# cat /etc/rancher/rke2/config.yaml
node-ip: 192.168.1.21
tls-san:
  - rancher.adm.dcsix
  - 10.100.1.20
root@k8s-node02:~# cat /etc/rancher/rke2/config.yaml
node-ip: 192.168.1.22
server: https://192.168.1.20:9345
token: ***
tls-san:
  - rancher.adm.dcsix
  - 10.100.1.20
root@k8s-node03:~# cat /etc/rancher/rke2/config.yaml
node-ip: 192.168.1.22
server: https://192.168.1.20:9345
token: ***
tls-san:
  - rancher.adm.dcsix
  - 10.100.1.20

Describe the bug: When running kubectl top nodes through the load balancer (layer 4, round robin), I noticed that only every third request was answered. I then ran kubectl top node on each node individually and found that only the node hosting the metrics-server pod responded. Finally, I started a busybox pod and tried to reach the cluster-internal IP of the metrics-server pod. It was reachable only from the node where metrics-server was deployed.

Steps To Reproduce:

  1. Build an HA cluster (node-ip set in config.yaml).
  2. Run kubectl top nodes on every node. Only one of them returns the expected output.

Actual behavior: metrics-server is reachable only from the node where its pod is running.
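
The per-node check described above can be scripted. This is a minimal sketch and not part of the original report. The function name check_top_per_server is hypothetical, and 6443 is the default Kubernetes API port on RKE2 servers. It also assumes that each server's certificate is valid for its node IP with the kubeconfig in use. If that does not hold, running kubectl top nodes locally on each node, as in the report, gives the same information.

```shell
# Hypothetical helper: query each API server directly, bypassing the load balancer.
# Arguments: the node IPs of the server nodes.
check_top_per_server() {
  for ip in "$@"; do
    # --request-timeout bounds the wait so an unreachable metrics API
    # fails quickly instead of hanging.
    if kubectl --server="https://${ip}:6443" --request-timeout=10s top nodes >/dev/null 2>&1; then
      echo "${ip}: ok"
    else
      echo "${ip}: FAILED"
    fi
  done
}
```

Calling check_top_per_server with the node IPs of the three servers prints one line per server, which makes the "one in three" pattern behind the round-robin load balancer visible at a glance.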

Additional context / logs: From node2 (trying to reach metrics-server on node3)

E0412 09:38:04.933934       1 available_controller.go:460] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.42.2.21:10250/apis/metrics.k8s.io/v1beta1: Get "https://10.42.2.21:10250/apis/metrics.k8s.io/v1beta1": context deadline exceeded
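
The error above is logged by the API server's aggregation layer, which forwards metrics.k8s.io requests to the metrics-server pod. The same condition can be inspected with kubectl, as sketched below. The function names are hypothetical, and the grep assumes the pod name contains "metrics-server", as the rke2-metrics-server service name in the DNS lookup suggests.

```shell
# Hypothetical helpers for inspecting the metrics API registration.

# Print the Available condition ("True" or "False") of the metrics APIService.
metrics_api_status() {
  kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}

# Show which node the metrics-server pod is scheduled on (NODE column).
metrics_pod_placement() {
  kubectl -n kube-system get pods -o wide | grep metrics-server
}
```

Comparing the NODE column of the pod with the node whose API server logs the timeout shows whether the failure only occurs across nodes.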

Busybox from node02

/ # nslookup 10.42.2.21
Server:         10.43.0.10
Address:        10.43.0.10:53

21.2.42.10.in-addr.arpa name = 10-42-2-21.rke2-metrics-server.kube-system.svc.cluster.local

/ # ping 10.42.2.21 -c 3
PING 10.42.2.21 (10.42.2.21): 56 data bytes

--- 10.42.2.21 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
/ # telnet 10.42.2.21 10250
^C
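
The failed ping and telnet target a pod on another node. As a rough illustration, assume the RKE2 default cluster CIDR of 10.42.0.0/16 with one /24 per node; this is an assumption, since the report does not show the pod CIDR settings. Under it, the first three octets of a pod IP identify the node's pod subnet. The helper names below are hypothetical.

```shell
# Print the /24 a pod IP belongs to (assumes one /24 pod subnet per node).
pod_subnet() {
  echo "$1" | awk -F. '{ printf "%s.%s.%s.0/24\n", $1, $2, $3 }'
}

# Succeed if two pod IPs fall into the same per-node subnet.
same_node_subnet() {
  [ "$(pod_subnet "$1")" = "$(pod_subnet "$2")" ]
}

# The real per-node assignment can be listed with:
#   kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR
```

Here pod_subnet 10.42.2.21 prints 10.42.2.0/24, while the busybox address from the Canal log, 10.42.0.24, maps to 10.42.0.0/24. Under this assumption the traffic has to cross the overlay network between two nodes.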

Canal Log on node2 for the busybox:

k8s" workload_id:"default/busybox5" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali0641058f83e" profile_ids:"kns.default" profile_ids:"ksa.default.de
fault" ipv4_nets:"10.42.0.24/32" >
2024-04-12 09:45:11.988 [INFO][44] felix/endpoint_mgr.go 602: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defa
ult/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:11.989 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-tw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:45:11.989 [INFO][44] felix/table.go 595: Chain became referenced, marking it for programming chainName="cali-pri-kns.default" ipVersion=0x4 tabl
e="filter"
2024-04-12 09:45:11.989 [INFO][44] felix/table.go 595: Chain became referenced, marking it for programming chainName="cali-pri-ksa.default.default" ipVersion=
0x4 table="filter"
2024-04-12 09:45:11.989 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-fw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:45:11.989 [INFO][44] felix/table.go 595: Chain became referenced, marking it for programming chainName="cali-pro-kns.default" ipVersion=0x4 tabl
e="filter"
2024-04-12 09:45:11.989 [INFO][44] felix/table.go 595: Chain became referenced, marking it for programming chainName="cali-pro-ksa.default.default" ipVersion=
0x4 table="filter"
2024-04-12 09:45:11.989 [INFO][44] felix/endpoint_mgr.go 648: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/
busybox5", EndpointId:"eth0"}
2024-04-12 09:45:11.989 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-from-wl-dispatch" ipVersion=0x4 table="filter"
2024-04-12 09:45:11.989 [INFO][44] felix/table.go 595: Chain became referenced, marking it for programming chainName="cali-fw-cali0641058f83e" ipVersion=0x4 t
able="filter"
2024-04-12 09:45:11.989 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-to-wl-dispatch" ipVersion=0x4 table="filter"
2024-04-12 09:45:11.989 [INFO][44] felix/table.go 595: Chain became referenced, marking it for programming chainName="cali-tw-cali0641058f83e" ipVersion=0x4 t
able="filter"
2024-04-12 09:45:11.989 [INFO][44] felix/endpoint_mgr.go 1283: Skipping configuration of interface because it is oper down. ifaceName="cali0641058f83e"
2024-04-12 09:45:11.989 [INFO][44] felix/endpoint_mgr.go 490: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=false status="
down" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:11.989 [INFO][44] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="down" workload=proto.WorkloadEndpointID{O
rchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:12.004 [INFO][44] felix/status_combiner.go 78: Endpoint down for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", Wo
rkloadId:"default/busybox5", EndpointId:"eth0"} ipVersion=0x4 status="down"
2024-04-12 09:45:12.004 [INFO][44] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defau
lt/busybox5", EndpointId:"eth0"} status="down"
2024-04-12 09:45:12.054 [INFO][44] felix/int_dataplane.go 1387: Linux interface state changed. ifIndex=29 ifaceName="cali0641058f83e" state="down"
2024-04-12 09:45:12.054 [INFO][44] felix/int_dataplane.go 1431: Linux interface addrs changed. addrs=set.Set{} ifaceName="cali0641058f83e"
2024-04-12 09:45:12.054 [INFO][44] felix/int_dataplane.go 2011: Received interface update msg=&intdataplane.ifaceStateUpdate{Name:"cali0641058f83e", State:"do
wn", Index:29}
2024-04-12 09:45:12.055 [INFO][44] felix/int_dataplane.go 2031: Received interface addresses update msg=&intdataplane.ifaceAddrsUpdate{Name:"cali0641058f83e",
 Addrs:set.Typed[string]{}}
2024-04-12 09:45:12.055 [INFO][44] felix/hostip_mgr.go 84: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"cali0641058f83e", Addrs:set.Ty
ped[string]{}}
2024-04-12 09:45:12.055 [INFO][44] felix/endpoint_mgr.go 431: Workload interface state changed; marking for status update. ifaceName="cali0641058f83e"
2024-04-12 09:45:12.055 [INFO][44] felix/endpoint_mgr.go 490: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=false status="
down" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:12.055 [INFO][44] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="down" workload=proto.WorkloadEndpointID{O
rchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:12.055 [INFO][44] felix/status_combiner.go 78: Endpoint down for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", Wo
rkloadId:"default/busybox5", EndpointId:"eth0"} ipVersion=0x4 status="down"
2024-04-12 09:45:12.055 [INFO][44] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defau
lt/busybox5", EndpointId:"eth0"} status="down"
2024-04-12 09:45:12.073 [INFO][44] felix/int_dataplane.go 1387: Linux interface state changed. ifIndex=29 ifaceName="cali0641058f83e" state="up"
2024-04-12 09:45:12.073 [INFO][44] felix/int_dataplane.go 2011: Received interface update msg=&intdataplane.ifaceStateUpdate{Name:"cali0641058f83e", State:"up
", Index:29}
2024-04-12 09:45:12.073 [INFO][44] felix/endpoint_mgr.go 374: Workload interface came up, marking for reconfiguration. ifaceName="cali0641058f83e"
2024-04-12 09:45:12.073 [INFO][44] felix/endpoint_mgr.go 431: Workload interface state changed; marking for status update. ifaceName="cali0641058f83e"
2024-04-12 09:45:12.073 [INFO][44] felix/endpoint_mgr.go 1215: Applying /proc/sys configuration to interface. ifaceName="cali0641058f83e"
2024-04-12 09:45:12.073 [INFO][44] felix/endpoint_mgr.go 490: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="u
p" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:12.073 [INFO][44] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{Orc
hestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:12.074 [INFO][44] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", Work
loadId:"default/busybox5", EndpointId:"eth0"} ipVersion=0x4 status="up"
2024-04-12 09:45:12.074 [INFO][44] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defau
lt/busybox5", EndpointId:"eth0"} status="up"
2024-04-12 09:45:13.990 [INFO][44] felix/iface_monitor.go 238: Netlink address update for known interface.  addr="fe80::ecee:eeff:feee:eeee" exists=true ifInd
ex=29
2024-04-12 09:45:13.990 [INFO][44] felix/int_dataplane.go 1431: Linux interface addrs changed. addrs=set.Set{fe80::ecee:eeff:feee:eeee} ifaceName="cali0641058
f83e"
2024-04-12 09:45:13.991 [INFO][44] felix/int_dataplane.go 2031: Received interface addresses update msg=&intdataplane.ifaceAddrsUpdate{Name:"cali0641058f83e",
 Addrs:set.Typed[string]{"fe80::ecee:eeff:feee:eeee":set.v{}}}
2024-04-12 09:45:13.991 [INFO][44] felix/hostip_mgr.go 84: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"cali0641058f83e", Addrs:set.Ty
ped[string]{"fe80::ecee:eeff:feee:eeee":set.v{}}}
2024-04-12 09:45:22.481 [INFO][44] felix/calc_graph.go 507: Local endpoint updated id=WorkloadEndpoint(node=k8s-node01, orchestrator=k8s, workload=default/bus
ybox5, name=eth0)
2024-04-12 09:45:22.481 [INFO][44] felix/int_dataplane.go 1954: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"
k8s" workload_id:"default/busybox5" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali0641058f83e" profile_ids:"kns.default" profile_ids:"ksa.default.de
fault" ipv4_nets:"10.42.0.24/32" >
2024-04-12 09:45:22.481 [INFO][44] felix/endpoint_mgr.go 602: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defa
ult/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:22.481 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-tw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:45:22.481 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-fw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:45:22.481 [INFO][44] felix/endpoint_mgr.go 648: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/
busybox5", EndpointId:"eth0"}
2024-04-12 09:45:22.481 [INFO][44] felix/endpoint_mgr.go 1215: Applying /proc/sys configuration to interface. ifaceName="cali0641058f83e"
2024-04-12 09:45:22.482 [INFO][44] felix/endpoint_mgr.go 490: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="u
p" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:22.482 [INFO][44] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{Orc
hestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:22.489 [INFO][44] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", Work
loadId:"default/busybox5", EndpointId:"eth0"} ipVersion=0x4 status="up"
2024-04-12 09:45:22.490 [INFO][44] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defau
lt/busybox5", EndpointId:"eth0"} status="up"
2024-04-12 09:45:29.622 [INFO][44] felix/summary.go 100: Summarising 32 dataplane reconciliation loops over 1m2.8s: avg=9ms longest=49ms (resync-filter-v4,res
ync-mangle-v4,update-filter-v4)
2024-04-12 09:45:35.521 [INFO][44] felix/calc_graph.go 507: Local endpoint updated id=WorkloadEndpoint(node=k8s-node01, orchestrator=k8s, workload=default/bus
ybox5, name=eth0)
2024-04-12 09:45:35.521 [INFO][44] felix/int_dataplane.go 1954: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"
k8s" workload_id:"default/busybox5" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali0641058f83e" profile_ids:"kns.default" profile_ids:"ksa.default.de
fault" ipv4_nets:"10.42.0.24/32" >
2024-04-12 09:45:35.522 [INFO][44] felix/endpoint_mgr.go 602: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defa
ult/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:35.522 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-tw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:45:35.522 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-fw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:45:35.522 [INFO][44] felix/endpoint_mgr.go 648: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/
busybox5", EndpointId:"eth0"}
2024-04-12 09:45:35.522 [INFO][44] felix/endpoint_mgr.go 1215: Applying /proc/sys configuration to interface. ifaceName="cali0641058f83e"
2024-04-12 09:45:35.522 [INFO][44] felix/endpoint_mgr.go 490: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="u
p" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:35.522 [INFO][44] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{Orc
hestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:35.530 [INFO][44] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", Work
loadId:"default/busybox5", EndpointId:"eth0"} ipVersion=0x4 status="up"
2024-04-12 09:45:35.530 [INFO][44] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defau
lt/busybox5", EndpointId:"eth0"} status="up"
2024-04-12 09:45:37.506 [INFO][44] felix/calc_graph.go 507: Local endpoint updated id=WorkloadEndpoint(node=k8s-node01, orchestrator=k8s, workload=default/bus
ybox5, name=eth0)
2024-04-12 09:45:37.506 [INFO][44] felix/int_dataplane.go 1954: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"
k8s" workload_id:"default/busybox5" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali0641058f83e" profile_ids:"kns.default" profile_ids:"ksa.default.de
fault" ipv4_nets:"10.42.0.24/32" >
2024-04-12 09:45:37.506 [INFO][44] felix/endpoint_mgr.go 602: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defa
ult/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:37.507 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-tw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:45:37.507 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-fw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:45:37.507 [INFO][44] felix/endpoint_mgr.go 648: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/
busybox5", EndpointId:"eth0"}
2024-04-12 09:45:37.507 [INFO][44] felix/endpoint_mgr.go 1215: Applying /proc/sys configuration to interface. ifaceName="cali0641058f83e"
2024-04-12 09:45:37.508 [INFO][44] felix/endpoint_mgr.go 490: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="u
p" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:37.508 [INFO][44] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{Orc
hestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:45:37.516 [INFO][44] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", Work
loadId:"default/busybox5", EndpointId:"eth0"} ipVersion=0x4 status="up"
2024-04-12 09:45:37.516 [INFO][44] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defau
lt/busybox5", EndpointId:"eth0"} status="up"
2024-04-12 09:45:53.022 [INFO][44] felix/int_dataplane.go 1954: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"k8s-node02"
 labels:<key:"beta.kubernetes.io/arch" value:"amd64" > labels:<key:"beta.kubernetes.io/instance-type" value:"rke2" > labels:<key:"beta.kubernetes.io/os" value
:"linux" > labels:<key:"kubernetes.io/arch" value:"amd64" > labels:<key:"kubernetes.io/hostname" value:"k8s-node02" > labels:<key:"kubernetes.io/os" value:"li
nux" > labels:<key:"node-role.kubernetes.io/control-plane" value:"true" > labels:<key:"node-role.kubernetes.io/etcd" value:"true" > labels:<key:"node-role.kub
ernetes.io/master" value:"true" > labels:<key:"node.kubernetes.io/instance-type" value:"rke2" >
2024-04-12 09:46:32.012 [INFO][44] felix/summary.go 100: Summarising 15 dataplane reconciliation loops over 1m2.4s: avg=7ms longest=15ms (resync-raw-v4)
2024-04-12 09:46:56.348 [INFO][44] felix/int_dataplane.go 1954: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"k8s-node03"
 labels:<key:"beta.kubernetes.io/arch" value:"amd64" > labels:<key:"beta.kubernetes.io/instance-type" value:"rke2" > labels:<key:"beta.kubernetes.io/os" value
:"linux" > labels:<key:"kubernetes.io/arch" value:"amd64" > labels:<key:"kubernetes.io/hostname" value:"k8s-node03" > labels:<key:"kubernetes.io/os" value:"li
nux" > labels:<key:"node-role.kubernetes.io/control-plane" value:"true" > labels:<key:"node-role.kubernetes.io/etcd" value:"true" > labels:<key:"node-role.kub
ernetes.io/master" value:"true" > labels:<key:"node.kubernetes.io/instance-type" value:"rke2" >
2024-04-12 09:47:32.014 [INFO][44] felix/summary.go 100: Summarising 11 dataplane reconciliation loops over 1m0s: avg=6ms longest=12ms (resync-ipsets-v4)
2024-04-12 09:47:39.774 [INFO][44] felix/calc_graph.go 507: Local endpoint updated id=WorkloadEndpoint(node=k8s-node01, orchestrator=k8s, workload=default/bus
ybox5, name=eth0)
2024-04-12 09:47:39.774 [INFO][44] felix/int_dataplane.go 1954: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"
k8s" workload_id:"default/busybox5" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali0641058f83e" profile_ids:"kns.default" profile_ids:"ksa.default.de
fault" ipv4_nets:"10.42.0.24/32" >
2024-04-12 09:47:39.775 [INFO][44] felix/endpoint_mgr.go 602: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defa
ult/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:39.775 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-tw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:47:39.776 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-fw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:47:39.776 [INFO][44] felix/endpoint_mgr.go 648: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/
busybox5", EndpointId:"eth0"}
2024-04-12 09:47:39.776 [INFO][44] felix/endpoint_mgr.go 1215: Applying /proc/sys configuration to interface. ifaceName="cali0641058f83e"
2024-04-12 09:47:39.777 [INFO][44] felix/endpoint_mgr.go 490: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="u
p" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:39.777 [INFO][44] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{Orc
hestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:39.786 [INFO][44] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", Work
loadId:"default/busybox5", EndpointId:"eth0"} ipVersion=0x4 status="up"
2024-04-12 09:47:39.786 [INFO][44] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defau
lt/busybox5", EndpointId:"eth0"} status="up"
2024-04-12 09:47:41.003 [INFO][44] felix/calc_graph.go 507: Local endpoint updated id=WorkloadEndpoint(node=k8s-node01, orchestrator=k8s, workload=default/bus
ybox5, name=eth0)
2024-04-12 09:47:41.003 [INFO][44] felix/int_dataplane.go 1954: Received *proto.WorkloadEndpointUpdate update from calculation graph msg=id:<orchestrator_id:"
k8s" workload_id:"default/busybox5" endpoint_id:"eth0" > endpoint:<state:"active" name:"cali0641058f83e" profile_ids:"kns.default" profile_ids:"ksa.default.de
fault" ipv4_nets:"10.42.0.24/32" >
2024-04-12 09:47:41.003 [INFO][44] felix/endpoint_mgr.go 602: Updating per-endpoint chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defa
ult/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:41.003 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-tw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:47:41.003 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-fw-cali0641058f83e" ipVersion=0x4 table="filter"
2024-04-12 09:47:41.004 [INFO][44] felix/endpoint_mgr.go 648: Updating endpoint routes. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/
busybox5", EndpointId:"eth0"}
2024-04-12 09:47:41.004 [INFO][44] felix/endpoint_mgr.go 1215: Applying /proc/sys configuration to interface. ifaceName="cali0641058f83e"
2024-04-12 09:47:41.004 [INFO][44] felix/endpoint_mgr.go 490: Re-evaluated workload endpoint status adminUp=true failed=false known=true operUp=true status="u
p" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:41.004 [INFO][44] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="up" workload=proto.WorkloadEndpointID{Orc
hestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:41.015 [INFO][44] felix/route_table.go 1185: Failed to access interface because it doesn't exist. error=Link not found ifaceName="cali0641058
f83e" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=254
2024-04-12 09:47:41.015 [INFO][44] felix/route_table.go 1253: Failed to get interface; it's down/gone. error=Link not found ifaceName="cali0641058f83e" ifaceR
egex="^cali.*" ipVersion=0x4 tableIndex=254
2024-04-12 09:47:41.015 [INFO][44] felix/route_table.go 589: Interface missing, will retry if it appears. ifaceName="cali0641058f83e" ifaceRegex="^cali.*" ipV
ersion=0x4 tableIndex=254
2024-04-12 09:47:41.016 [INFO][44] felix/status_combiner.go 81: Endpoint up for at least one IP version id=proto.WorkloadEndpointID{OrchestratorId:"k8s", Work
loadId:"default/busybox5", EndpointId:"eth0"} ipVersion=0x4 status="up"
2024-04-12 09:47:41.016 [INFO][44] felix/status_combiner.go 98: Reporting combined status. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defau
lt/busybox5", EndpointId:"eth0"} status="up"
2024-04-12 09:47:41.063 [INFO][44] felix/calc_graph.go 505: Local endpoint deleted id=WorkloadEndpoint(node=k8s-node01, orchestrator=k8s, workload=default/bus
ybox5, name=eth0)
2024-04-12 09:47:41.064 [INFO][44] felix/int_dataplane.go 1954: Received *proto.WorkloadEndpointRemove update from calculation graph msg=id:<orchestrator_id:"
k8s" workload_id:"default/busybox5" endpoint_id:"eth0" >
2024-04-12 09:47:41.064 [INFO][44] felix/int_dataplane.go 1954: Received *proto.ActiveProfileRemove update from calculation graph msg=id:<name:"kns.default" >

2024-04-12 09:47:41.064 [INFO][44] felix/int_dataplane.go 1954: Received *proto.ActiveProfileRemove update from calculation graph msg=id:<name:"ksa.default.de
fault" >
2024-04-12 09:47:41.064 [INFO][44] felix/endpoint_mgr.go 703: Workload removed, deleting its chains. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", Workloa
dId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:41.064 [INFO][44] felix/table.go 606: Chain no longer referenced, marking it for removal chainName="cali-pri-kns.default" ipVersion=0x4 table
="filter"
2024-04-12 09:47:41.064 [INFO][44] felix/table.go 606: Chain no longer referenced, marking it for removal chainName="cali-pri-ksa.default.default" ipVersion=0
x4 table="filter"
2024-04-12 09:47:41.064 [INFO][44] felix/table.go 606: Chain no longer referenced, marking it for removal chainName="cali-pro-kns.default" ipVersion=0x4 table
="filter"
2024-04-12 09:47:41.064 [INFO][44] felix/table.go 606: Chain no longer referenced, marking it for removal chainName="cali-pro-ksa.default.default" ipVersion=0
x4 table="filter"
2024-04-12 09:47:41.064 [INFO][44] felix/endpoint_mgr.go 558: Workload removed, deleting old state. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", Workload
Id:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:41.064 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-from-wl-dispatch" ipVersion=0x4 table="filter"
2024-04-12 09:47:41.064 [INFO][44] felix/table.go 606: Chain no longer referenced, marking it for removal chainName="cali-fw-cali0641058f83e" ipVersion=0x4 ta
ble="filter"
2024-04-12 09:47:41.064 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-to-wl-dispatch" ipVersion=0x4 table="filter"
2024-04-12 09:47:41.064 [INFO][44] felix/table.go 606: Chain no longer referenced, marking it for removal chainName="cali-tw-cali0641058f83e" ipVersion=0x4 ta
ble="filter"
2024-04-12 09:47:41.064 [INFO][44] felix/table.go 521: Queueing update of chain. chainName="cali-to-wl-dispatch" ipVersion=0x4 table="filter"
2024-04-12 09:47:41.064 [INFO][44] felix/table.go 606: Chain no longer referenced, marking it for removal chainName="cali-tw-cali0641058f83e" ipVersion=0x4 ta
ble="filter"
2024-04-12 09:47:41.064 [INFO][44] felix/endpoint_mgr.go 490: Re-evaluated workload endpoint status adminUp=false failed=false known=false operUp=false status
="" workloadEndpointID=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:41.064 [INFO][44] felix/status_combiner.go 58: Storing endpoint status update ipVersion=0x4 status="" workload=proto.WorkloadEndpointID{Orche
stratorId:"k8s", WorkloadId:"default/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:41.065 [INFO][44] felix/route_table.go 1185: Failed to access interface because it doesn't exist. error=Link not found ifaceName="cali0641058
f83e" ifaceRegex="^cali.*" ipVersion=0x4 tableIndex=254
2024-04-12 09:47:41.065 [INFO][44] felix/route_table.go 1253: Failed to get interface; it's down/gone. error=Link not found ifaceName="cali0641058f83e" ifaceR
egex="^cali.*" ipVersion=0x4 tableIndex=254
2024-04-12 09:47:41.065 [INFO][44] felix/route_table.go 589: Interface missing, will retry if it appears. ifaceName="cali0641058f83e" ifaceRegex="^cali.*" ipV
ersion=0x4 tableIndex=254
2024-04-12 09:47:41.065 [INFO][44] felix/conntrack.go 90: Removing conntrack flows ip=10.42.0.24
2024-04-12 09:47:41.104 [INFO][44] felix/int_dataplane.go 1387: Linux interface state changed. ifIndex=29 ifaceName="cali0641058f83e" state="down"
2024-04-12 09:47:41.104 [INFO][44] felix/int_dataplane.go 1431: Linux interface addrs changed. addrs=set.Set{} ifaceName="cali0641058f83e"
2024-04-12 09:47:41.106 [INFO][44] felix/int_dataplane.go 1387: Linux interface state changed. ifIndex=29 ifaceName="cali0641058f83e" state=""
2024-04-12 09:47:41.106 [INFO][44] felix/int_dataplane.go 1431: Linux interface addrs changed. addrs=<nil> ifaceName="cali0641058f83e"
2024-04-12 09:47:41.106 [INFO][44] felix/iface_monitor.go 235: Netlink address update but interface isn't yet known.  Will handle when interface is signalled.
 addr="fe80::ecee:eeff:feee:eeee" exists=false ifIndex=29
2024-04-12 09:47:41.109 [INFO][44] felix/status_combiner.go 86: Reporting endpoint removed. id=proto.WorkloadEndpointID{OrchestratorId:"k8s", WorkloadId:"defa
ult/busybox5", EndpointId:"eth0"}
2024-04-12 09:47:41.109 [INFO][44] felix/int_dataplane.go 2011: Received interface update msg=&intdataplane.ifaceStateUpdate{Name:"cali0641058f83e", State:"do
wn", Index:29}
2024-04-12 09:47:41.109 [INFO][44] felix/int_dataplane.go 2031: Received interface addresses update msg=&intdataplane.ifaceAddrsUpdate{Name:"cali0641058f83e",
 Addrs:set.Typed[string]{}}
2024-04-12 09:47:41.109 [INFO][44] felix/hostip_mgr.go 84: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"cali0641058f83e", Addrs:set.Ty
ped[string]{}}
2024-04-12 09:47:41.109 [INFO][44] felix/int_dataplane.go 2011: Received interface update msg=&intdataplane.ifaceStateUpdate{Name:"cali0641058f83e", State:"",
 Index:29}
2024-04-12 09:47:41.109 [INFO][44] felix/int_dataplane.go 2031: Received interface addresses update msg=&intdataplane.ifaceAddrsUpdate{Name:"cali0641058f83e",
 Addrs:set.Set[string](nil)}
2024-04-12 09:47:41.109 [INFO][44] felix/hostip_mgr.go 84: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"cali0641058f83e", Addrs:set.Se
t[string](nil)}
2024-04-12 09:48:07.192 [INFO][44] felix/int_dataplane.go 1954: Received *proto.HostMetadataV4V6Update update from calculation graph msg=hostname:"k8s-node01"
 labels:<key:"beta.kubernetes.io/arch" value:"amd64" > labels:<key:"beta.kubernetes.io/instance-type" value:"rke2" > labels:<key:"beta.kubernetes.io/os" value
:"linux" > labels:<key:"kubernetes.io/arch" value:"amd64" > labels:<key:"kubernetes.io/hostname" value:"k8s-node01" > labels:<key:"kubernetes.io/os" value:"li
nux" > labels:<key:"node-role.kubernetes.io/control-plane" value:"true" > labels:<key:"node-role.kubernetes.io/etcd" value:"true" > labels:<key:"node-role.kub
ernetes.io/master" value:"true" > labels:<key:"node.kubernetes.io/instance-type" value:"rke2" >

kgts23 commented 5 months ago

I've changed the CNI from Canal to Calico. Now the metrics-server is reachable from all nodes.

root@k8s-node01:~# nc -uzv 192.168.1.21 4789
Connection to 192.168.1.21 4789 port [udp/*] succeeded!
root@k8s-node01:~# nc -uzv 192.168.1.22 4789
Connection to 192.168.1.22 4789 port [udp/*] succeeded!
root@k8s-node01:~# nc -uzv 192.168.1.23 4789
Connection to 192.168.1.23 4789 port [udp/*] succeeded!
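
A note on reading the check above: with UDP, nc -z reports success unless an ICMP port-unreachable message comes back, so these results alone do not prove that VXLAN traffic is delivered. The RKE2 networking requirements list UDP 8472 for Canal with VXLAN and UDP 4789 for Calico with VXLAN, so checking the earlier Canal setup would need the other port. A minimal sketch follows; the function name vxlan_port and the interface name eth0 are hypothetical.

```shell
# Hypothetical helper: map an RKE2 CNI name to its default VXLAN UDP port.
vxlan_port() {
  case "$1" in
    canal)  echo 8472 ;;
    calico) echo 4789 ;;
    *)      echo "unknown CNI: $1" >&2; return 1 ;;
  esac
}

# Example capture on the receiving node while another node sends pod traffic
# (the interface name eth0 is an assumption):
#   tcpdump -ni eth0 udp port "$(vxlan_port canal)"
```

Seeing encapsulated packets arrive in such a capture is a stronger signal than the nc probe, because it observes the overlay traffic itself.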