kubeedge join failed: RunPodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded

mailliw2010 commented 2 months ago

What happened and what you expected to happen:

What happened:

keadm join --cloudcore-ipport=192.168.11.73:10000 --kubeedge-version=v1.18.0 -t 3f745d2d46cff9a3d77b23ed53aa0e1f5cb17d1618e275b95466297bce94fe54.eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE3MjI2NDgyNjN9.BPDmE_3yNuXkFmZqhu0NjVRUplNTcVNtmjqTCGmvWjw --v=9

I0806 07:21:24.320469 4703 clientconn.go:846] "[core] [ SubChannel #5]Subchannel created\n"
I0806 07:21:24.320496 4703 logging.go:39] "[core] [Channel #4]Channel Connectivity change to CONNECTING\n"
I0806 07:21:24.320516 4703 clientconn.go:305] "[core] [Channel #4]Channel exiting idle mode\n"
I0806 07:21:24.320538 4703 remote_runtime.go:136] "Validating the CRI v1 API runtime version"
I0806 07:21:24.320589 4703 logging.go:39] "[core] [ SubChannel #5]Subchannel Connectivity change to CONNECTING\n"
I0806 07:21:24.320603 4703 logging.go:39] "[core] [ SubChannel #5]Subchannel picks a new address \"/run/containerd/containerd.sock\" to connect\n"
I0806 07:21:24.320944 4703 pickfirst.go:166] "[core] [pick-first-lb 0xc000dd65d0] Received SubConn state update: 0xc000dd6630, {ConnectivityState:CONNECTING ConnectionError:
I0806 07:21:24.321024 4703 logging.go:39] "[core] [ SubChannel #5]Subchannel Connectivity change to READY\n"
I0806 07:21:24.321034 4703 pickfirst.go:166] "[core] [pick-first-lb 0xc000dd65d0] Received SubConn state update: 0xc000dd6630, {ConnectivityState:READY ConnectionError:
I0806 07:21:24.321042 4703 logging.go:39] "[core] [Channel #4]Channel Connectivity change to READY\n"
I0806 07:21:24.321956 4703 remote_runtime.go:143] "Validated CRI v1 runtime API"
I0806 07:21:24.322034 4703 join_others.go:191] 4. Pull Images
Pulling kubeedge/installation-package:v1.18.0 ...
Successfully pulled kubeedge/installation-package:v1.18.0
I0806 07:21:31.836506 4703 join_others.go:191] 5. Copy resources from the image to the management directory
E0806 07:21:51.837469 4703 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Error: edge node join failed: copy resources failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
execute keadm command failed: edge node join failed: copy resources failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
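The step 5 line and the RunPodSandbox error in the log above are almost exactly 20 seconds apart, which is consistent with a fixed deadline expiring while the sandbox was being created, rather than an immediate refusal from containerd. A minimal Python sketch of that arithmetic follows. The helper names `parse_klog` and `elapsed` are made up for illustration, and the sketch assumes both lines were logged on the same day.

```python
import re
from datetime import datetime, timedelta

# klog header: severity letter, MMDD, wall-clock time, PID, source file:line, message.
KLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<mmdd>\d{4}) (?P<clock>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<pid>\d+) (?P<src>\S+)\] (?P<msg>.*)$"
)


def parse_klog(line):
    """Return (severity, time, source, message), or None if the line has no klog header."""
    m = KLOG_RE.match(line.strip())
    if m is None:
        return None
    clock = datetime.strptime(m["clock"], "%H:%M:%S.%f")
    return m["sev"], clock, m["src"], m["msg"]


def elapsed(start_line, end_line):
    """Time between two klog lines (assumption: both were written on the same day)."""
    start = parse_klog(start_line)
    end = parse_klog(end_line)
    if start is None or end is None:
        raise ValueError("not a klog-formatted line")
    return end[1] - start[1]


# The two lines are copied from the keadm output above.
step5 = ('I0806 07:21:31.836506 4703 join_others.go:191] '
         '5. Copy resources from the image to the management directory')
failure = ('E0806 07:21:51.837469 4703 remote_runtime.go:193] '
           '"RunPodSandbox from runtime service failed" '
           'err="rpc error: code = DeadlineExceeded desc = context deadline exceeded"')

gap = elapsed(step5, failure)
print(f"seconds between step 5 and the error: {gap.total_seconds()}")
```

Lines without a klog header, such as the "Pulling ..." progress messages, make `parse_klog` return None, so they can be skipped when scanning a whole log.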

What I expected to happen: the KubeEdge edge node joins the Kubernetes cluster successfully.

How to reproduce it (as minimally and precisely as possible): always

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version): v1.30

root@k8s-master01:~# kubectl get node
NAME           STATUS   ROLES           AGE     VERSION
k8s-master01   Ready    control-plane   5d20h   v1.30.0

root@k8s-master01:~# kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS          AGE
kube-system   calico-kube-controllers-5b9b456c66-h7p5t   1/1     Running   7 (24h ago)       5d20h
kube-system   calico-node-wgfqf                          1/1     Running   0                 5d20h
kube-system   coredns-7b5944fdcf-c4sc5                   1/1     Running   6 (24h ago)       5d20h
kube-system   coredns-7b5944fdcf-j49gg                   1/1     Running   6 (24h ago)       5d20h
kube-system   etcd-k8s-master01                          1/1     Running   1756              5d20h
kube-system   kube-apiserver-k8s-master01                1/1     Running   362 (24h ago)     5d20h
kube-system   kube-controller-manager-k8s-master01       1/1     Running   382 (6h31m ago)   5d20h
kube-system   kube-proxy-ldb9x                           1/1     Running   0                 5d20h
kube-system   kube-scheduler-k8s-master01                1/1     Running   376 (22h ago)     5d20h
kubeedge      cloudcore-54f4f45bb6-699qg                 1/1     Running   6 (24h ago)       4d5h

KubeEdge version(e.g. cloudcore --version and edgecore --version): v1.18.0

OS: Ubuntu 20.04, running in a VMware virtual machine

Containerd v1.7.12, key config (all other settings keep their default values):
SystemdCgroup = false
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.8"

Shelley-BaoYue commented 2 months ago

You can track down the cause in the containerd log.

marcelomrwin commented 2 months ago

The same happened to me. Containerd logs didn't help much. Basically they repeat like this:

error="failed to handle sandbox TaskExit event: failed to stop sandbox: context deadline exceeded: unknown"

level=error msg="RemovePodSandbox failed" error="rpc error: code = DeadlineExceeded desc = failed to forcibly stop sandbox failed to stop sandbox container context deadline exceeded
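When the containerd log repeats the same few errors, as in the two lines quoted above, it can help to collapse the repetition into counts before looking for anything unusual. The following is a rough Python sketch of that idea, not an official tool: `parse_fields` and `summarize_errors` are hypothetical helper names, and the parser only assumes the `key=value` / `key="quoted value"` layout visible in the quoted lines (it tolerates a missing closing quote, since the second line is cut off).

```python
import re
from collections import Counter

# key=value pairs; a value is either a quoted string (closing quote optional,
# because log lines may be truncated) or a run of non-space characters.
PAIR_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"?|\S+)')


def parse_fields(line):
    """Split one log line into a dict of its key=value fields."""
    fields = {}
    for key, raw in PAIR_RE.findall(line):
        if raw.startswith('"'):
            raw = raw[1:]
            if raw.endswith('"'):
                raw = raw[:-1]
        fields[key] = raw
    return fields


def summarize_errors(lines):
    """Count error lines, grouped by msg (or by the first clause of the error text)."""
    counts = Counter()
    for line in lines:
        fields = parse_fields(line)
        if fields.get("level") != "error" and "error" not in fields:
            continue
        label = fields.get("msg") or fields.get("error", "").split(":")[0] or "(no message)"
        counts[label] += 1
    return counts


# Sample input: the two lines quoted in this thread, the first one repeated.
sample = [
    'error="failed to handle sandbox TaskExit event: failed to stop sandbox: context deadline exceeded: unknown"',
    'level=error msg="RemovePodSandbox failed" error="rpc error: code = DeadlineExceeded desc = failed to forcibly stop sandbox failed to stop sandbox container context deadline exceeded',
    'error="failed to handle sandbox TaskExit event: failed to stop sandbox: context deadline exceeded: unknown"',
]

for label, count in summarize_errors(sample).most_common():
    print(count, label)
```

On a host where containerd runs as a systemd unit named `containerd` (an assumption about the setup), the input lines could come from `journalctl -u containerd --no-pager` instead of the sample list.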

Catherine-monk commented 1 month ago

Try modifying the /etc/containerd/config.toml file to set systemd_cgroup = false.
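Note that the key suggested here, `systemd_cgroup`, is spelled differently from the `SystemdCgroup` key the original report says is already false; in containerd 1.x configs these are two separate settings that live in different tables. Before editing, it may be worth listing which of the two actually appear in the file. A small Python sketch under stated assumptions: `find_cgroup_settings` is a hypothetical helper, the scan is a plain line match rather than a real TOML parse, and the sample text is an illustrative excerpt (assuming the containerd 1.7 layout), not the reporter's actual file.

```python
import re

# Matches an uncommented line that sets either spelling of the cgroup switch.
SETTING_RE = re.compile(r"^\s*(systemd_cgroup|SystemdCgroup)\s*=\s*(true|false)\b")


def find_cgroup_settings(config_text):
    """Return (line number, key, value) for every cgroup-driver setting found."""
    found = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        m = SETTING_RE.match(line)
        if m:
            found.append((lineno, m.group(1), m.group(2) == "true"))
    return found


# Illustrative excerpt only; table names assume the containerd 1.7 config layout.
SAMPLE = """\
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.8"
  systemd_cgroup = false
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
"""

for lineno, key, value in find_cgroup_settings(SAMPLE):
    print(f"line {lineno}: {key} = {value}")
```

To check a real node, the text would be read from /etc/containerd/config.toml, for example with `open("/etc/containerd/config.toml").read()`. Commented-out lines are ignored by the pattern.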

kubeedge / kubeedge, issue #5782