flannel-io / flannel

flannel is a network fabric for containers, designed for Kubernetes
Apache License 2.0
8.6k stars 2.87k forks

k8s flannel network problem dial tcp 10.96.0.1:443: net/http: TLS handshake timeout #1999

Closed nickzibow closed 1 week ago

nickzibow commented 1 week ago

My environment is a Kubernetes cluster running on OpenStack VMs. I added a physical machine to the cluster, and a flannel network problem occurred (screenshots attached).

rbrtbnfgl commented 1 week ago

It's trying to contact the kube API server. Does the node have access to that IP?

luckydogxf commented 1 week ago

The node can reach the kube API:

root@uat-datascience-k8-22:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.16.205.1    0.0.0.0         UG    0      0        0 eno1
172.16.205.0    0.0.0.0         255.255.255.0   U     0      0        0 eno1

But flannel.1 cannot be created.
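For anyone hitting the same symptom, a rough way to check both plain reachability and large-packet delivery from the node is sketched below. The service IP 10.96.0.1 and the MTU of 9000 are taken from this thread; adjust them for your cluster. A plain TCP connect can succeed while oversized packets are silently dropped, which is why a path-MTU test is worth running alongside the reachability check.

```shell
#!/bin/sh
# Reachability check against the in-cluster API service IP.
# curl exits non-zero on a connect/TLS timeout, matching the
# symptom reported in this issue.
curl -k --connect-timeout 5 https://10.96.0.1:443/healthz \
  || echo "API unreachable or TLS handshake failed"

# Path-MTU test with a non-fragmenting ping. For an MTU of 9000
# the largest ICMP payload is 9000 - 20 (IP) - 8 (ICMP) = 8972.
MTU=9000
PAYLOAD=$((MTU - 28))
echo "testing payload size: $PAYLOAD"
ping -c 3 -M do -s "$PAYLOAD" 10.96.0.1 \
  || echo "large packets dropped (possible MTU mismatch)"
```

If the small `curl` probe works but the jumbo-sized `ping -M do` fails, the problem is packet size rather than routing.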

luckydogxf commented 1 week ago

Here is the log from a normal worker node, for comparison.

I0617 05:21:46.333833       1 main.go:211] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
W0617 05:21:46.334411       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0617 05:21:46.362261       1 kube.go:139] Waiting 10m0s for node controller to sync
I0617 05:21:46.362354       1 kube.go:469] Starting kube subnet manager
I0617 05:21:46.391198       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.0.0/24]
I0617 05:21:46.391393       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.7.0/24]
I0617 05:21:46.391412       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.10.0/24]
I0617 05:21:46.391425       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.11.0/24]
I0617 05:21:46.391439       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.12.0/24]
I0617 05:21:46.391458       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.13.0/24]
I0617 05:21:46.391471       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.15.0/24]
I0617 05:21:46.391511       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.17.0/24]
I0617 05:21:46.391528       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.19.0/24]
I0617 05:21:46.391546       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.18.0/24]
I0617 05:21:46.391564       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.16.0/24]
I0617 05:21:46.391576       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.1.0/24]
I0617 05:21:46.391591       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.14.0/24]
I0617 05:21:46.391633       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.20.0/24]
I0617 05:21:46.391656       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.2.0/24]
I0617 05:21:46.391668       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.3.0/24]
I0617 05:21:46.391680       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.8.0/24]
I0617 05:21:46.391695       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.4.0/24]
I0617 05:21:46.391707       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.5.0/24]
I0617 05:21:46.391720       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.9.0/24]
I0617 05:21:46.391736       1 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.244.6.0/24]
I0617 05:21:47.363287       1 kube.go:146] Node controller sync successful
I0617 05:21:47.363452       1 main.go:231] Created subnet manager: Kubernetes Subnet Manager - uat-datascience-k8-13
I0617 05:21:47.363461       1 main.go:234] Installing signal handlers
I0617 05:21:47.363835       1 main.go:452] Found network config - Backend type: vxlan
I0617 05:21:47.374506       1 kube.go:669] List of node(uat-datascience-k8-13) annotations: map[string]string{"csi.volume.kubernetes.io/nodeid":"{\"rook-ceph.cephfs.csi.ceph.com\":\"uat-datascience-k8-13\",\"rook-ceph.rbd.csi.ceph.com\":\"uat-datascience-k8-13\"}", "flannel.alpha.coreos.com/backend-data":"{\"VNI\":1,\"VtepMAC\":\"0a:e9:53:d2:3b:c8\"}", "flannel.alpha.coreos.com/backend-type":"vxlan", "flannel.alpha.coreos.com/kube-subnet-manager":"true", "flannel.alpha.coreos.com/public-ip":"172.16.205.66", "kubeadm.alpha.kubernetes.io/cri-socket":"unix:///var/run/containerd/containerd.sock", "node.alpha.kubernetes.io/ttl":"0", "volumes.kubernetes.io/controller-managed-attach-detach":"true"}
I0617 05:21:47.374794       1 match.go:210] Determining IP address of default interface
I0617 05:21:47.377608       1 match.go:263] Using interface with name ens3 and address 172.16.205.66
I0617 05:21:47.377668       1 match.go:285] Defaulting external address to interface address (172.16.205.66)
I0617 05:21:47.377764       1 vxlan.go:141] VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false
I0617 05:21:47.385198       1 kube.go:636] List of node(uat-datascience-k8-13) annotations: map[string]string{"csi.volume.kubernetes.io/nodeid":"{\"rook-ceph.cephfs.csi.ceph.com\":\"uat-datascience-k8-13\",\"rook-ceph.rbd.csi.ceph.com\":\"uat-datascience-k8-13\"}", "flannel.alpha.coreos.com/backend-data":"{\"VNI\":1,\"VtepMAC\":\"0a:e9:53:d2:3b:c8\"}", "flannel.alpha.coreos.com/backend-type":"vxlan", "flannel.alpha.coreos.com/kube-subnet-manager":"true", "flannel.alpha.coreos.com/public-ip":"172.16.205.66", "kubeadm.alpha.kubernetes.io/cri-socket":"unix:///var/run/containerd/containerd.sock", "node.alpha.kubernetes.io/ttl":"0", "volumes.kubernetes.io/controller-managed-attach-detach":"true"}
I0617 05:21:47.385303       1 vxlan.go:155] Interface flannel.1 mac address set to: 0a:e9:53:d2:3b:c8

As we can see, flannel.1 is created properly on the healthy node. Therefore, the root cause is that the failing node cannot reach the API server, as shown by the TLS handshake timeout.

luckydogxf commented 1 week ago

I set the MTU to 9000 to match the other nodes, and the problem was resolved. Thanks.
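A likely explanation for why an MTU mismatch surfaces as a TLS handshake timeout rather than a hard connection failure: the TCP handshake uses small packets and succeeds, but the certificate exchange sends packets near the path MTU, which are silently dropped when one node's interface MTU is smaller than the rest. The sketch below uses the 50-byte figure commonly cited as the VXLAN encapsulation overhead that flannel's vxlan backend subtracts from the parent interface MTU; it is an illustration of the arithmetic, not output from this thread.

```shell
#!/bin/sh
# flannel's vxlan backend derives flannel.1's MTU from the parent
# interface MTU minus the VXLAN encapsulation overhead
# (14 outer Ethernet + 20 IP + 8 UDP + 8 VXLAN = 50 bytes for IPv4).
PARENT_MTU=9000
VXLAN_OVERHEAD=50
FLANNEL_MTU=$((PARENT_MTU - VXLAN_OVERHEAD))
echo "expected flannel.1 MTU: $FLANNEL_MTU"

# On the misconfigured node the physical NIC was presumably still at
# the default 1500, so overlay packets sized for the jumbo-frame
# cluster were dropped. Aligning the NIC with the rest of the cluster
# (interface name eno1 is from this thread's route table) fixes it:
# ip link set dev eno1 mtu 9000
```

After changing the NIC MTU, restarting the flannel pod on that node lets it recompute flannel.1's MTU from the new parent value.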