kubeedge / kubeedge

Kubernetes Native Edge Computing Framework (project under CNCF)
https://kubeedge.io
Apache License 2.0

keadm join -> 'Error: edge node join failed: timed out waiting for the condition' #4681

Open Ymh13383894400 opened 1 year ago

Ymh13383894400 commented 1 year ago

What happened and what you expected to happen:

I tried to join an edge node to the master with this command:

keadm join --cloudcore-ipport=..... --token=....... --kubeedge-version=1.13.0

It produced this error:

I0317 18:44:57.215708  389267 command.go:845] 1. Check KubeEdge edgecore process status
I0317 18:44:57.343676  389267 command.go:845] 2. Check if the management directory is clean
I0317 18:44:57.344030  389267 join.go:107] 3. Create the necessary directories
I0317 18:44:57.370190  389267 join.go:184] 4. Pull Images
Pulling kubeedge/pause:3.6 ...
E0317 18:44:57.386884  389267 remote_image.go:160] "Get ImageStatus from image service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService" image="kubeedge/pause:3.6"
Error: edge node join failed: pull Images failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService
execute keadm command failed:  edge node join failed: pull Images failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService

Following #4621, I then used this command

keadm join -r docker --cloudcore-ipport=192.168...:10000 --token=........ --kubeedge-version=1.13.0

and

keadm join --runtimetype=docker --remote-runtime-endpoint="unix:///var/run/docker.sock" --cloudcore-ipport=192.168....:10000 --token=....... --kubeedge-version=1.13.0

Both produced this error:

I0317 20:07:12.248874  457194 command.go:845] 1. Check KubeEdge edgecore process status
I0317 20:07:12.371822  457194 command.go:845] 2. Check if the management directory is clean
I0317 20:07:12.374166  457194 join.go:107] 3. Create the necessary directories
I0317 20:07:12.383554  457194 join.go:184] 4. Pull Images
Pulling kubeedge/installation-package:v1.13.0 ...
Pulling eclipse-mosquitto:1.6.15 ...
Pulling kubeedge/pause:3.6 ...
I0317 20:07:12.427680  457194 join.go:184] 5. Copy resources from the image to the management directory
I0317 20:07:18.006075  457194 join.go:184] 6. Start the default mqtt service
I0317 20:07:25.282524  457194 join.go:107] 7. Generate systemd service file
I0317 20:07:25.283226  457194 join.go:107] 8. Generate EdgeCore default configuration
I0317 20:07:25.283587  457194 join.go:270] The configuration does not exist or the parsing fails, and the default configuration is generated
I0317 20:07:25.351698  457194 join.go:107] 9. Run EdgeCore daemon
I0317 20:07:45.345660  457194 join.go:431] 
I0317 20:07:45.345908  457194 join.go:432] KubeEdge edgecore is running, For logs visit: journalctl -u edgecore.service -xe
Error: edge node join failed: timed out waiting for the condition

Do I have to install containerd as Docker's runtime to solve this, or is this a problem with Docker's configuration or Docker's version?

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

Shelley-BaoYue commented 1 year ago

When using keadm init to install cloudcore, you need to pass --profile version=v1.13.0 to specify the version. You can also run journalctl -u edgecore.service -xe to see the edgecore log.

Ymh13383894400 commented 1 year ago

# journalctl -u edgecore.service -xe
3月 20 11:51:29 node1 edgecore[173477]: I0320 11:51:29.470475  173477 server.go:103] Version: v1.13.0
3月 20 11:51:29 node1 edgecore[173477]: F0320 11:51:29.474259  173477 server.go:114] failed to check the running environment: kubelet should not running on edge node when runnin>
3月 20 11:51:29 node1 systemd[1]: edgecore.service: Main process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░ 
░░ An ExecStart= process belonging to unit edgecore.service has exited.
░░ 
░░ The process' exit code is 'exited' and its exit status is 1.
3月 20 11:51:29 node1 systemd[1]: edgecore.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░ 
░░ The unit edgecore.service has entered the 'failed' state with result 'exit-code'.

Is this error caused by a mistake in how I installed kubectl?

Shelley-BaoYue commented 1 year ago

As the log says, kubelet should not be running on an edge node. You can try executing systemctl set-environment CHECK_EDGECORE_ENVIRONMENT="false" to skip this check, but running both kubelet and edgecore on the same edge node may cause problems.

Ymh13383894400 commented 1 year ago

After I executed systemctl set-environment CHECK_EDGECORE_ENVIRONMENT="false" and keadm join ..., the log showed:

# journalctl -u edgecore.service -xe
3月 20 18:58:53 node1 edgecore[388417]: E0320 18:58:53.866847  388417 process.go:486] metamanager not supported operation: connect
3月 20 18:58:53 node1 edgecore[388417]: I0320 18:58:53.871550  388417 client.go:153] finish hub-client pub
3月 20 18:58:53 node1 edgecore[388417]: I0320 18:58:53.871627  388417 eventbus.go:71] Init Sub And Pub Client for external mqtt broker tcp://127.0.0.1:1883 successfully
3月 20 18:58:53 node1 edgecore[388417]: W0320 18:58:53.871683  388417 eventbus.go:168] Action not found
3月 20 18:58:53 node1 edgecore[388417]: I0320 18:58:53.872128  388417 client.go:89] edge-hub-cli subscribe topic to $hw/events/device/+/state/update
3月 20 18:58:53 node1 edgecore[388417]: I0320 18:58:53.882132  388417 client.go:89] edge-hub-cli subscribe topic to $hw/events/device/+/twin/+
3月 20 18:58:53 node1 edgecore[388417]: I0320 18:58:53.886354  388417 client.go:89] edge-hub-cli subscribe topic to $hw/events/node/+/membership/get
3月 20 18:58:53 node1 edgecore[388417]: I0320 18:58:53.896798  388417 client.go:89] edge-hub-cli subscribe topic to SYS/dis/upload_records
3月 20 18:58:53 node1 edgecore[388417]: I0320 18:58:53.903197  388417 client.go:89] edge-hub-cli subscribe topic to +/user/#
3月 20 18:58:53 node1 edgecore[388417]: I0320 18:58:53.903718  388417 client.go:97] list edge-hub-cli-topics status, no record, skip sync
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.684237  388417 server.go:414] "No api server defined - no events will be sent to API server"
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.688799  388417 server.go:519] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /"
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.726662  388417 container_manager_linux.go:281] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.726850  388417 container_manager_linux.go:286] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/edged ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none}
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.726973  388417 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.738199  388417 container_manager_linux.go:321] "Creating device plugin manager" devicePluginEnabled=true
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.766958  388417 state_mem.go:36] "Initialized new in-memory state store"
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.767191  388417 kubelet.go:267] "Using dockershim is deprecated, please consider using a full-fledged CRI implementation"
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.767386  388417 client.go:99] "Start docker client with request timeout" timeout="2m0s"
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.950991  388417 docker_service.go:571] "Hairpin mode is set but kubenet is not enabled, falling back to HairpinVeth" hairpinMode=promiscuous-bridge
3月 20 18:58:55 node1 edgecore[388417]: I0320 18:58:55.951104  388417 docker_service.go:243] "Hairpin mode is set" hairpinMode=hairpin-veth
3月 20 18:58:56 node1 edgecore[388417]: I0320 18:58:56.199758  388417 docker_service.go:258] "Docker cri networking managed by the network plugin" networkPluginName="kubernetes.io/no-op"
3月 20 18:58:56 node1 edgecore[388417]: I0320 18:58:56.275734  388417 docker_service.go:264] "Docker Info" dockerInfo=&{ID:03220b34-0757-4157-80c2-eab7df63e46e Containers:13 ContainersRunning:5 ContainersPaused:0 ContainersStopped:8 Images:13 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Using metacopy false] [Native Overlay Diff true] [userxattr false]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:false KernelMemoryTCP:false CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:53 OomKillDisable:false NGoroutines:55 SystemTime:2023-03-20T18:58:56.201770801+08:00 LoggingDriver:json-file CgroupDriver:systemd CgroupVersion:2 NEventsListener:0 KernelVersion:5.19.0-35-generic OperatingSystem:Ubuntu 22.04 LTS OSVersion:22.04 OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc0002f1420 NCPU:2 MemTotal:4077703168 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:node1 Labels:[] ExperimentalBuild:false ServerVersion:23.0.1 ClusterStore: ClusterAdvertise: Runtimes:map[io.containerd.runc.v2:{Path:runc Args:[] Shim:<nil>} runc:{Path:runc Args:[] Shim:<nil>}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil> Warnings:[]} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:2456e983eb9e37e47538f59ea18f2043c9a73640 Expected:2456e983eb9e37e47538f59ea18f2043c9a73640} RuncCommit:{ID:v1.1.4-0-g5fd4c4d Expected:v1.1.4-0-g5fd4c4d} InitCommit:{ID:de40ad0 Expected:de40ad0} SecurityOptions:[name=apparmor name=seccomp,profile=builtin name=cgroupns] 
ProductLicense: DefaultAddressPools:[] Warnings:[]}
3月 20 18:58:56 node1 edgecore[388417]: E0320 18:58:56.275985  388417 edged.go:124] Start edged failed, err: failed to run Kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
3月 20 18:58:56 node1 systemd[1]: edgecore.service: Main process exited, code=exited, status=1/FAILURE
3月 20 18:58:56 node1 systemd[1]: edgecore.service: Failed with result 'exit-code'.

An error still occurred, even though systemctl set-environment CHECK_EDGECORE_ENVIRONMENT="false" skips this check.

Start edged failed, err: failed to run Kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
# cat /etc/docker/daemon.conf
{
 "registry-mirrors": ["https://lzyjejsb.mirror.aliyuncs.com"],
 "exec-opts": ["native.cgroupdriver=systemd"]
}
# cat /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging:
  flushFrequency: 0
  options:
    json:
      infoBufferSize: "0"
  verbosity: 0
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
resolvConf: /run/systemd/resolve/resolv.conf
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s

Shelley-BaoYue commented 1 year ago

You are installing the edgecore module of KubeEdge, so the config you need to check is /etc/kubeedge/config/edgecore.yaml, not the /var/lib/kubelet/config.yaml you listed. To install KubeEdge on an edge node, you don't need to install kubelet.
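
A minimal sketch of that check, assuming the driver appears in the config as a plain "cgroupDriver: <value>" line (the exact nesting inside edgecore.yaml is not shown in this thread, so the match ignores indentation). The function name edgecore_cgroup_driver is hypothetical, not part of keadm or edgecore.

```shell
# Print the cgroupDriver value found in an edgecore config file.
# Usage: edgecore_cgroup_driver [path]
# Assumption: the value sits on a line of the form "cgroupDriver: <value>",
# optionally double-quoted, at any indentation level.
edgecore_cgroup_driver() {
    # Default to the config path named in this thread.
    sed -n 's/^[[:space:]]*cgroupDriver:[[:space:]]*"\{0,1\}\([A-Za-z]*\)"\{0,1\}[[:space:]]*$/\1/p' \
        "${1:-/etc/kubeedge/config/edgecore.yaml}" | head -n 1
}
```

The Docker side can be read with docker info --format '{{.CgroupDriver}}'. The two values need to match for edged to start.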

Ymh13383894400 commented 1 year ago

# journalctl -u edgecore.service -f
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.470513   18912 core.go:46] starting module websocket
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.470627   18912 core.go:46] starting module eventbus
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.470761   18912 core.go:46] starting module metamanager
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.470873   18912 core.go:46] starting module twin
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.478367   18912 certmanager.go:161] Certificate rotation is enabled.
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.478593   18912 websocket.go:51] Websocket start to connect Access
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.480902   18912 edged.go:102] Starting edged...
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.481510   18912 common.go:97] start connect to mqtt server with client id: hub-client-sub-1679482985
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.482294   18912 common.go:99] client hub-client-sub-1679482985 isconnected: false
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.483513   18912 process.go:117] Begin to sync sqlite
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.485368   18912 dmiworker.go:67] dmi worker start
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.485980   18912 dmiworker.go:215] success to init device model info from db
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.489290   18912 server.go:397] "Kubelet version" kubeletVersion="v0.0.0-master+$Format:%H$"
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.490880   18912 dmiworker.go:235] success to init device info from db
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.491255   18912 dmiworker.go:255] success to init device mapper info from db
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.492664   18912 server.go:183] init uds socket: /etc/kubeedge/dmi.sock
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.497846   18912 client.go:134] finish hub-client sub
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.497992   18912 common.go:97] start connect to mqtt server with client id: hub-client-pub-1679482985
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.498051   18912 common.go:99] client hub-client-pub-1679482985 isconnected: false
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.501929   18912 client.go:89] edge-hub-cli subscribe topic to $hw/events/upload/#
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.503815   18912 client.go:153] finish hub-client pub
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.503954   18912 eventbus.go:71] Init Sub And Pub Client for external mqtt broker tcp://127.0.0.1:1883 successfully
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.506024   18912 client.go:89] edge-hub-cli subscribe topic to $hw/events/device/+/state/update
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.506375   18912 ws.go:46] dial wss://192.168.100.100:10000/e632aba927ea4ac2b575ec1603d56f10/node/events successfully
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.506858   18912 websocket.go:93] Websocket connect to cloud access successful
3月 22 19:03:05 node edgecore[18912]: W0322 19:03:05.507180   18912 eventbus.go:168] Action not found
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.507656   18912 process.go:299] DeviceTwin receive msg
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.507839   18912 process.go:68] Send msg to the CommModule module in twin
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.508382   18912 process.go:461] node connection event occur: cloud_connected
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.509972   18912 process.go:461] node connection event occur: cloud_connected
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.510414   18912 client.go:89] edge-hub-cli subscribe topic to $hw/events/device/+/twin/+
3月 22 19:03:05 node edgecore[18912]: W0322 19:03:05.510814   18912 manager.go:159] Cannot detect current cgroup on cgroup v2
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.515569   18912 client.go:89] edge-hub-cli subscribe topic to $hw/events/node/+/membership/get
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.517435   18912 client.go:89] edge-hub-cli subscribe topic to SYS/dis/upload_records
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.534475   18912 client.go:89] edge-hub-cli subscribe topic to +/user/#
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.536259   18912 client.go:97] list edge-hub-cli-topics status, no record, skip sync
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.823139   18912 server.go:445] "No api server defined - no events will be sent to API server"
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.823186   18912 server.go:612] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /"
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.823598   18912 container_manager_linux.go:280] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.823670   18912 container_manager_linux.go:285] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none}
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.823728   18912 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.823748   18912 container_manager_linux.go:320] "Creating device plugin manager" devicePluginEnabled=true
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.823804   18912 state_mem.go:36] "Initialized new in-memory state store"
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.823906   18912 kubelet.go:268] "Using dockershim is deprecated, please consider using a full-fledged CRI implementation"
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.823955   18912 client.go:78] "Connecting to docker on the dockerEndpoint" endpoint="unix:///var/run/docker.sock"
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.823976   18912 client.go:97] "Start docker client with request timeout" timeout="2m0s"
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.861742   18912 docker_service.go:570] "Hairpin mode is set but kubenet is not enabled, falling back to HairpinVeth" hairpinMode=promiscuous-bridge
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.861937   18912 docker_service.go:242] "Hairpin mode is set" hairpinMode=hairpin-veth
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.862099   18912 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.876481   18912 hostport_manager.go:72] "The binary conntrack is not installed, this can cause failures in network connection cleanup."
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.876745   18912 hostport_manager.go:72] "The binary conntrack is not installed, this can cause failures in network connection cleanup."
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.891154   18912 docker_service.go:257] "Docker cri networking managed by the network plugin" networkPluginName="kubernetes.io/no-op"
3月 22 19:03:05 node edgecore[18912]: I0322 19:03:05.930379   18912 docker_service.go:263] "Docker Info" dockerInfo=&{ID:053dd63d-61ee-4c8d-aaea-6df2ecfe7028 Containers:1 ContainersRunning:1 ContainersPaused:0 ContainersStopped:0 Images:3 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Using metacopy false] [Native Overlay Diff true] [userxattr false]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:false KernelMemoryTCP:false CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:30 OomKillDisable:false NGoroutines:40 SystemTime:2023-03-22T19:03:05.892013132+08:00 LoggingDriver:json-file CgroupDriver:systemd CgroupVersion:2 NEventsListener:0 KernelVersion:5.15.0-25-generic OperatingSystem:Ubuntu 22.04 LTS OSVersion:22.04 OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc0008e81c0 NCPU:4 MemTotal:4080095232 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:node Labels:[] ExperimentalBuild:false ServerVersion:23.0.1 ClusterStore: ClusterAdvertise: Runtimes:map[io.containerd.runc.v2:{Path:runc Args:[] Shim:<nil>} runc:{Path:runc Args:[] Shim:<nil>}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil> Warnings:[]} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:2456e983eb9e37e47538f59ea18f2043c9a73640 Expected:2456e983eb9e37e47538f59ea18f2043c9a73640} RuncCommit:{ID:v1.1.4-0-g5fd4c4d Expected:v1.1.4-0-g5fd4c4d} InitCommit:{ID:de40ad0 Expected:de40ad0} SecurityOptions:[name=apparmor name=seccomp,profile=builtin name=cgroupns] 
ProductLicense: DefaultAddressPools:[] Warnings:[]}
3月 22 19:03:05 node edgecore[18912]: E0322 19:03:05.930541   18912 edged.go:107] Start edged failed, err: failed to run Kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
3月 22 19:03:05 node systemd[1]: edgecore.service: Main process exited, code=exited, status=1/FAILURE
3月 22 19:03:05 node systemd[1]: edgecore.service: Failed with result 'exit-code'.

I reinstalled Docker and ran keadm join -r docker token=....., but there is still an error:

Start edged failed, err: failed to run Kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
3月 22 19:03:05 node systemd[1]: edgecore.service: Main process exited, code=exited, status=1/FAILURE
# cat /etc/docker/daemon.json
{
 "registry-mirrors": ["https://dockerproxy.com"],
 "exec-opts": ["native.cgroupdriver=systemd"]
}

Shelley-BaoYue commented 1 year ago

edgecore uses cgroupfs by default, while your Docker uses systemd. You can edit cgroupDriver in /etc/kubeedge/config/edgecore.yaml and restart edgecore, or you can set --cgroupdriver=systemd when you run keadm join to join the edge node.
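
The first option could be scripted roughly as follows. This is only a sketch: set_edgecore_cgroup_driver is a hypothetical helper, and it assumes the setting is a single "cgroupDriver: <value>" line whose indentation should be preserved. It writes a .bak copy next to the config before changing it.

```shell
# Rewrite the "cgroupDriver:" value in an edgecore config file, keeping a backup.
# Usage: set_edgecore_cgroup_driver <systemd|cgroupfs> [path]
set_edgecore_cgroup_driver() {
    driver="$1"
    config="${2:-/etc/kubeedge/config/edgecore.yaml}"
    # Accept only the two drivers discussed in this thread.
    case "$driver" in
        systemd|cgroupfs) ;;
        *) echo "unsupported cgroup driver: $driver" >&2; return 1 ;;
    esac
    cp "$config" "$config.bak" || return 1
    # Keep the original indentation and key; replace only the value.
    sed 's/^\([[:space:]]*cgroupDriver:\).*$/\1 '"$driver"'/' "$config.bak" > "$config"
}
```

After set_edgecore_cgroup_driver systemd, restart the service with systemctl restart edgecore and follow journalctl -u edgecore.service -f to confirm that edged starts.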

reckless-huang commented 1 year ago

[root@vm03 ~]# docker version
Client: Docker Engine - Community
 Version:           23.0.2
 API version:       1.42
 Go version:        go1.19.7
 Git commit:        569dd73
 Built:             Mon Mar 27 16:18:54 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.2
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.7
  Git commit:       219f21b
  Built:            Mon Mar 27 16:16:31 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.19
  GitCommit:        1e1ea6e986c6c86565bc33d52e34b81b3e2bc71f
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

keadm join --cloudcore-ipport=$kepoint --token=$ketoken --kubeedge-version=v1.13.0
ERROR remote_image.go:160] "Get ImageStatus from image service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService"

Shelley-BaoYue commented 1 year ago

@reckless-huang KubeEdge v1.13 uses containerd as the default runtime. If you use Docker, you need to set --runtimetype=docker when executing keadm join.

reckless-huang commented 1 year ago

--runtimetype=docker solved my problem, thanks.

CoffeeHi commented 1 year ago

~# keadm join --cloudcore-ipport="kubeedge.xxx.com":10000 --token=$(keadm gettoken) --cgroupdriver systemd --runtimetype=docker --remote-runtime-endpoint="unix:///var/run/docker.sock" --kubeedge-version=v1.14.0
I0705 10:17:42.389317 2035202 command.go:845] 1. Check KubeEdge edgecore process status
I0705 10:17:42.397521 2035202 command.go:845] 2. Check if the management directory is clean
I0705 10:17:42.397589 2035202 join.go:107] 3. Create the necessary directories
I0705 10:17:42.398467 2035202 join.go:184] 4. Pull Images
Pulling eclipse-mosquitto:1.6.15 ...
Pulling kubeedge/installation-package:v1.14.0 ...
Pulling kubeedge/pause:3.6 ...
I0705 10:17:42.410238 2035202 join.go:184] 5. Copy resources from the image to the management directory
I0705 10:17:43.256522 2035202 join.go:184] 6. Start the default mqtt service
I0705 10:17:43.654119 2035202 join.go:107] 7. Generate systemd service file
I0705 10:17:43.654324 2035202 join.go:107] 8. Generate EdgeCore default configuration
I0705 10:17:43.654368 2035202 join.go:270] The configuration does not exist or the parsing fails, and the default configuration is generated
I0705 10:17:43.660401 2035202 join.go:107] 9. Run EdgeCore daemon
I0705 10:17:44.705122 2035202 join.go:431] 
I0705 10:17:44.705143 2035202 join.go:432] KubeEdge edgecore is running, For logs visit: journalctl -u edgecore.service -xe

Hi @Shelley-BaoYue, I used the latest keadm version (1.14.0) to join, but it always blocks here. I found the log below. Any suggestions?

Kubernetes version: 1.23.5
KubeEdge version: cloudcore 1.13.0 (installed with the Helm chart)
Docker version: 20.10.2
Jul 05 10:17:44 ip-172-31-54-195 edgecore[2035457]: E0705 10:17:44.798896 2035457 edged.go:125] Start edged failed, err: failed to run Kubelet: unsupported CRI runtime: "docker", only "remote" is currently supported

Shelley-BaoYue commented 1 year ago

@CoffeeHi Docker is not supported from KubeEdge v1.14 onward because the Kubernetes dependency has been upgraded to 1.24. If you still want to use Docker, you need to install cri-dockerd and use --remote-runtime-endpoint=unix:///var/run/cri-dockerd.sock. Ref: https://github.com/Mirantis/cri-dockerd and https://kubernetes.io/docs/tasks/administer-cluster/migrating-from-dockershim/migrate-dockershim-dockerd/
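
As a rough sketch of how the runtime flags differ for Docker users across the two versions discussed in this thread: the helper name docker_runtime_flags is hypothetical, and --runtimetype=remote for v1.14 and later is an assumption inferred from the error message above (only "remote" is currently supported), not something confirmed in this thread. Versions below 1.14 are treated like v1.13.

```shell
# Print the keadm join runtime flags for a host whose container engine is Docker.
# Usage: docker_runtime_flags <kubeedge-version>   e.g. v1.13.0 or 1.14.0
# Assumption: the major version is 1, so only the minor version is inspected.
docker_runtime_flags() {
    version="${1#v}"        # accept both "v1.14.0" and "1.14.0"
    minor="${version#*.}"   # drop the major part: "14.0"
    minor="${minor%%.*}"    # keep only the minor part: "14"
    case "$minor" in
        ''|*[!0-9]*) echo "cannot parse version: $1" >&2; return 1 ;;
    esac
    if [ "$minor" -ge 14 ]; then
        # Assumed: with cri-dockerd the runtime type is "remote", per the edged error.
        echo "--runtimetype=remote --remote-runtime-endpoint=unix:///var/run/cri-dockerd.sock"
    else
        # v1.13 behaviour as described earlier in this thread.
        echo "--runtimetype=docker"
    fi
}
```

The output can be spliced into the join command, for example keadm join --cloudcore-ipport=... --token=... --kubeedge-version=v1.14.0 $(docker_runtime_flags v1.14.0), once cri-dockerd is installed and its socket exists.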

xiaocx0110 commented 1 month ago

I encountered the same issue. Have you managed to resolve it? I used the following command:

./keadm join --cloudcore-ipport=1*:10000 --token=70f719534a189ddaef1a**DiLVbFg_38UtqW4 --kubeedge-version=1.14.0 --runtimetype=docker --remote-runtime-endpoint="unix:///var/run/cri-dockerd.sock"

The error is:

15:29:33.097150 140353 edged.go:125] Start edged failed, err: failed to run Kubelet: unsupported CRI runtime: "docker", only "remote" is curre>

Shelley-BaoYue commented 1 month ago

@xiaocx0110 https://kubeedge.io/docs/setup/prerequisites/runtime#docker-engine