kubeedge / kubeedge

Kubernetes Native Edge Computing Framework (project under CNCF)
https://kubeedge.io
Apache License 2.0

Flannel pod automatically deployed after edgecore is joined to cloudcore using keadm immediately fails #3691

Open · inadquatecoding opened this issue 2 years ago

inadquatecoding commented 2 years ago

What happened and what you expected to happen: I joined my edge node to my cloudcore, but after the edge node joins, it automatically tries to deploy a flannel container, which subsequently fails.

The expectation is that a flannel pod that is automatically deployed on the edge node after joining the cloudcore would also automagically work.

How to reproduce it (as minimally and precisely as possible):

On the core node:

- local IP for the core node is 10.43.6.20 (same as my ens19 interface)
- kubeadm init --pod-network-cidr=10.50.0.0/24
- kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
- keadm init
- keadm gettoken

On the edge node:

keadm join --cloudcore-ipport=10.43.6.20:10000 --token=
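For readability, here are the same reproduction steps as a single shell session (the token value is elided, as in the original report):

```sh
# On the core/cloud node (local IP 10.43.6.20 on interface ens19):
kubeadm init --pod-network-cidr=10.50.0.0/24
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
keadm init
keadm gettoken

# On the edge node, pasting the token printed by `keadm gettoken`:
keadm join --cloudcore-ipport=10.43.6.20:10000 --token=<token>
```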

Anything else we need to know?:

NAME            STATUS   ROLES                  AGE   VERSION                    INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
group8central   Ready    control-plane,master   78m   v1.21.0                    10.43.6.20    <none>        Ubuntu 20.04.4 LTS   5.13.0-35-generic   docker://20.10.7

group8edge      Ready    agent,edge             17m   v1.22.6-kubeedge-v1.10.0   10.43.6.26    <none>        Ubuntu 20.04.3 LTS   5.13.0-35-generic   docker://20.10.7

Running/Deployed pods:

$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                    READY   STATUS    RESTARTS   AGE    IP           NODE            NOMINATED NODE   READINESS GATES
kube-system   coredns-558bd4d5db-96k9x                1/1     Running   0          107m   10.50.0.2    group8central   <none>           <none>
kube-system   coredns-558bd4d5db-pvwzb                1/1     Running   0          107m   10.50.0.3    group8central   <none>           <none>
kube-system   etcd-group8central                      1/1     Running   0          108m   10.43.6.20   group8central   <none>           <none>
kube-system   kube-apiserver-group8central            1/1     Running   0          108m   10.43.6.20   group8central   <none>           <none>
kube-system   kube-controller-manager-group8central   1/1     Running   0          108m   10.43.6.20   group8central   <none>           <none>
kube-system   kube-flannel-ds-d7ppm                   1/1     Running   0          96m    10.43.6.20   group8central   <none>           <none>
kube-system   kube-flannel-ds-mh5ld                   0/1     Error     13         47m    10.43.6.26   group8edge      <none>           <none>
kube-system   kube-proxy-b9m9d                        1/1     Running   0          47m    10.43.6.26   group8edge      <none>           <none>
kube-system   kube-proxy-tpx4f                        1/1     Running   0          107m   10.43.6.20   group8central   <none>           <none>
kube-system   kube-scheduler-group8central            1/1     Running   0          108m   10.43.6.20   group8central   <none>           <none>

Edgecore logs (the rest are attached as kubeedgeEdgeNode.log), collected with journalctl --unit edgecore.service:

Mar 11 22:24:41 group8edge edgecore[4533]: I0311 22:24:41.123742    4533 edged.go:962] worker [2] backoff pod addition item [kube-flannel-ds-mh5ld] failed, re-add to queue
Mar 11 22:24:41 group8edge edgecore[4533]: I0311 22:24:41.169153    4533 edged.go:957] worker [4] get pod addition item [kube-flannel-ds-mh5ld]
Mar 11 22:24:41 group8edge edgecore[4533]: E0311 22:24:41.169184    4533 edged.go:960] consume pod addition backoff: Back-off consume pod [kube-flannel-ds-mh5ld] addition  error, backoff: [2m40s]
Mar 11 22:24:41 group8edge edgecore[4533]: I0311 22:24:41.169212    4533 edged.go:962] worker [4] backoff pod addition item [kube-flannel-ds-mh5ld] failed, re-add to queue
Mar 11 22:24:42 group8edge edgecore[4533]: I0311 22:24:42.177273    4533 edged.go:957] worker [0] get pod addition item [kube-flannel-ds-mh5ld]
Mar 11 22:24:42 group8edge edgecore[4533]: E0311 22:24:42.177342    4533 edged.go:960] consume pod addition backoff: Back-off consume pod [kube-flannel-ds-mh5ld] addition  error, backoff: [2m40s]
Mar 11 22:24:42 group8edge edgecore[4533]: I0311 22:24:42.177373    4533 edged.go:957] worker [0] get pod addition item [kube-flannel-ds-mh5ld]
Mar 11 22:24:42 group8edge edgecore[4533]: E0311 22:24:42.177395    4533 edged.go:960] consume pod addition backoff: Back-off consume pod [kube-flannel-ds-mh5ld] addition  error, backoff: [2m40s]
Mar 11 22:24:42 group8edge edgecore[4533]: I0311 22:24:42.177430    4533 edged.go:962] worker [0] backoff pod addition item [kube-flannel-ds-mh5ld] failed, re-add to queue
Mar 11 22:24:42 group8edge edgecore[4533]: I0311 22:24:42.177458    4533 edged.go:962] worker [0] backoff pod addition item [kube-flannel-ds-mh5ld] failed, re-add to queue
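The edged entries above only show the add/backoff loop, not the underlying flanneld error. Since kubectl logs against an edge node only works once the cloudStream/edgeStream modules are enabled, one way to get at the real error (a sketch, assuming direct shell access to the edge host and its Docker runtime) is to read the container logs locally:

```sh
# Run on the edge node itself; this setup uses Docker as the runtime.
# List the containers edged created for the flannel pod, including exited ones:
docker ps -a --filter name=k8s_kube-flannel

# Read flanneld's output to see why it exits with code 1
# (substitute a CONTAINER ID from the listing above):
docker logs <container-id>
```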

Additional Information:

$ kubectl describe pods kube-flannel-ds-mh5ld --namespace=kube-system
Name:                 kube-flannel-ds-mh5ld
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 group8edge/10.43.6.26
Start Time:           Fri, 11 Mar 2022 21:51:05 -0500
Labels:               app=flannel
                      controller-revision-hash=fff6d8f96
                      pod-template-generation=1
                      tier=node
Annotations:          <none>
Status:               Running
IP:                   10.43.6.26
IPs:
  IP:           10.43.6.26
Controlled By:  DaemonSet/kube-flannel-ds
Init Containers:
  install-cni-plugin:
    Container ID:  docker://a1dbea7ad595b05ce1c61447b627cf403c66f99ef06eccd3ce3e7b1063e2d4cf
    Image:         rancher/mirrored-flannelcni-flannel-cni-plugin:v1.0.1
    Image ID:      docker-pullable://rancher/mirrored-flannelcni-flannel-cni-plugin@sha256:5dd61f95e28fa7ef897ff2fa402ce283e5078d334401d2f62d00a568f779f2d5
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
    Args:
      -f
      /flannel
      /opt/cni/bin/flannel
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 11 Mar 2022 21:51:09 -0500
      Finished:     Fri, 11 Mar 2022 21:51:09 -0500
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /opt/cni/bin from cni-plugin (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r8cz9 (ro)
  install-cni:
    Container ID:  docker://c9e378c3db350c4b6ff8f35d42bae91bd017b21b87fc80366780842ceb7061b4
    Image:         rancher/mirrored-flannelcni-flannel:v0.17.0
    Image ID:      docker-pullable://rancher/mirrored-flannelcni-flannel@sha256:4bf659e449be809763b04f894f53a3d8610e00cf2cd979bb4fffc9470eb40d1b
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
    Args:
      -f
      /etc/kube-flannel/cni-conf.json
      /etc/cni/net.d/10-flannel.conflist
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 11 Mar 2022 21:51:13 -0500
      Finished:     Fri, 11 Mar 2022 21:51:13 -0500
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/cni/net.d from cni (rw)
      /etc/kube-flannel/ from flannel-cfg (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r8cz9 (ro)
Containers:
  kube-flannel:
    Container ID:  docker://2f03f7b6c0d31fb8e281587ffa12a8896713ae744e33055bd7364f8cf1f86af0
    Image:         rancher/mirrored-flannelcni-flannel:v0.17.0
    Image ID:      docker-pullable://rancher/mirrored-flannelcni-flannel@sha256:4bf659e449be809763b04f894f53a3d8610e00cf2cd979bb4fffc9470eb40d1b
    Port:          <none>
    Host Port:     <none>
    Command:
      /opt/bin/flanneld
    Args:
      --ip-masq
      --kube-subnet-mgr
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 11 Mar 2022 22:32:29 -0500
      Finished:     Fri, 11 Mar 2022 22:32:30 -0500
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 11 Mar 2022 22:32:29 -0500
      Finished:     Fri, 11 Mar 2022 22:32:30 -0500
    Ready:          False
    Restart Count:  12
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      POD_NAME:       kube-flannel-ds-mh5ld (v1:metadata.name)
      POD_NAMESPACE:  kube-system (v1:metadata.namespace)
    Mounts:
      /etc/kube-flannel/ from flannel-cfg (rw)
      /run/flannel from run (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r8cz9 (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  run:
    Type:          HostPath (bare host directory volume)
    Path:          /run/flannel
    HostPathType:  
  cni-plugin:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:  
  cni:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:  
  flannel-cfg:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kube-flannel-cfg
    Optional:  false
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  kube-api-access-r8cz9:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 :NoSchedule op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  43m   default-scheduler  Successfully assigned kube-system/kube-flannel-ds-mh5ld to group8edge

EdgeNode Description:

$ kubectl describe node group8edge
Name:               group8edge
Roles:              agent,edge
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=group8edge
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/agent=
                    node-role.kubernetes.io/edge=
Annotations:        node.alpha.kubernetes.io/ttl: 0
CreationTimestamp:  Fri, 11 Mar 2022 21:50:55 -0500
Taints:             <none>
Unschedulable:      false
Lease:              Failed to get lease: leases.coordination.k8s.io "group8edge" not found
Conditions:
  Type    Status  LastHeartbeatTime                 LastTransitionTime                Reason      Message
  ----    ------  -----------------                 ------------------                ------      -------
  Ready   True    Fri, 11 Mar 2022 22:36:13 -0500   Fri, 11 Mar 2022 21:50:55 -0500   EdgeReady   edge is posting ready status
Addresses:
  InternalIP:  10.43.6.26
  Hostname:    group8edge
Capacity:
  cpu:                8
  ephemeral-storage:  102168536Ki
  memory:             119963Mi
  pods:               110
Allocatable:
  cpu:                8
  ephemeral-storage:  101119960Ki
  memory:             119863Mi
  pods:               110
System Info:
  Machine ID:                 
  System UUID:                
  Boot ID:                    
  Kernel Version:             5.13.0-35-generic
  OS Image:                   Ubuntu 20.04.3 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://20.10.7
  Kubelet Version:            v1.22.6-kubeedge-v1.10.0
  Kube-Proxy Version:         
Non-terminated Pods:          (2 in total)
  Namespace                   Name                     CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                     ------------  ----------  ---------------  -------------  ---
  kube-system                 kube-flannel-ds-mh5ld    100m (1%)     100m (1%)   50Mi (0%)        50Mi (0%)      45m
  kube-system                 kube-proxy-b9m9d         0 (0%)        0 (0%)      0 (0%)           0 (0%)         45m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests   Limits
  --------           --------   ------
  cpu                100m (1%)  100m (1%)
  memory             50Mi (0%)  50Mi (0%)
  ephemeral-storage  0 (0%)     0 (0%)
Events:
  Type    Reason    Age   From        Message
  ----    ------    ----  ----        -------
  Normal  Starting  44m   kube-proxy  Starting kube-proxy.


zxyy-bys commented 2 years ago

@inadquatecoding You can modify the DaemonSet's spec.template.spec.affinity.nodeAffinity in the flannel yaml file to avoid scheduling flannel on your edge node, and use EdgeMesh instead.
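For anyone looking for the concrete shape of that change, here is a minimal sketch (not a verified fix): a strategic-merge patch against the kube-flannel-ds DaemonSet from the kube-flannel.yml applied above, keying off the node-role.kubernetes.io/edge label that KubeEdge puts on edge nodes (visible in the node description in this issue). Strategic merge replaces nodeSelectorTerms wholesale, so the upstream kubernetes.io/os=linux term is restated:

```sh
kubectl -n kube-system patch daemonset kube-flannel-ds --patch "$(cat <<'EOF'
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  # keep the upstream OS restriction
                  - key: kubernetes.io/os
                    operator: In
                    values: ["linux"]
                  # skip nodes labeled as KubeEdge edge nodes
                  - key: node-role.kubernetes.io/edge
                    operator: DoesNotExist
EOF
)"
```

Once the edge node no longer matches, the DaemonSet controller should delete the failing kube-flannel-ds pod from it; EdgeMesh can then handle service networking on the edge side.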

albacanete commented 1 year ago

Isn't there a solution for deploying flannel on KubeEdge edge nodes?

shawn-robusoft commented 10 months ago

How do you solve this problem?