Open ljjzhka opened 2 years ago
After an edge device joins the cluster, its node subsequently cannot be found. The cloudcore log is as follows:
[root@master ~]# journalctl -xef | grep cloudcore | grep twowin
Aug 29 17:31:06 master cloudcore[4390]: I0829 17:31:06.390911 4390 upstream.go:428] node: twowin already exists, do nothing
Aug 29 17:33:20 master cloudcore[4390]: E0829 17:33:20.672483 4390 messagehandler.go:471] Failed to send event to node: twowin, affected event: id: 932c5a11-6aea-4c9d-abcb-f05a95c33094, parent_id: , group: resource, source: edgecontroller, resource: aiot-test/pod/remote-connect-25-86d9dfb85-8tk45, operation: update, err: use of closed network connection
Aug 29 17:34:20 master cloudcore[4390]: I0829 17:34:20.994217 4390 synccontroller.go:148] ObjectSync twowin.cffc2a7d-c421-4b1b-8b4b-b021cb7e4d7a will be deleted since node twowin has been deleted
Aug 29 17:34:20 master cloudcore[4390]: W0829 17:34:20.999169 4390 messagehandler.go:394] node twowin is deleted, data for node will be cleaned up
Aug 29 17:34:21 master cloudcore[4390]: E0829 17:34:20.999320 4390 messagehandler.go:447] nodeQueue for node twowin has shutdown
Aug 29 17:34:21 master cloudcore[4390]: W0829 17:34:20.999348 4390 messagehandler.go:201] Stop keepalive check for node: twowin
Aug 29 17:34:21 master cloudcore[4390]: I0829 17:34:21.000749 4390 synccontroller.go:148] ObjectSync twowin.548b37fc-592b-49ce-be2d-99bcc6a4bbe4 will be deleted since node twowin has been deleted
Aug 29 17:34:21 master cloudcore[4390]: I0829 17:34:21.003794 4390 synccontroller.go:148] ObjectSync twowin.52ac182c-af69-4f4d-8ade-b3faf6a598c9 will be deleted since node twowin has been deleted
Aug 29 17:34:21 master cloudcore[4390]: I0829 17:34:21.007568 4390 synccontroller.go:148] ObjectSync twowin.7534f791-2a25-43e8-bdd7-77e89345b6a0 will be deleted since node twowin has been deleted
Aug 29 17:34:27 master cloudcore[4390]: W0829 17:34:27.482498 4390 upstream.go:468] message: fdd3b922-5174-4883-94ca-47cd46595d10 process failure, node twowin not found
Aug 29 17:34:32 master cloudcore[4390]: E0829 17:34:32.088579 4390 messagehandler.go:126] Failed to load node : twowin
Aug 29 17:34:37 master cloudcore[4390]: W0829 17:34:37.525128 4390 upstream.go:468] message: 266c4007-ab39-40ee-ab2b-72d0d44b14ac process failure, node twowin not found
Aug 29 17:34:47 master cloudcore[4390]: E0829 17:34:47.089535 4390 messagehandler.go:126] Failed to load node : twowin
Aug 29 17:34:47 master cloudcore[4390]: W0829 17:34:47.565485 4390 upstream.go:468] message: a15d3b43-2898-4bf6-a418-f67f17ee955a process failure, node twowin not found
Aug 29 17:34:57 master cloudcore[4390]: I0829 17:34:57.593978 4390 messagehandler.go:304] edge node twowin for project e632aba927ea4ac2b575ec1603d56f10 connected
Aug 29 17:34:57 master cloudcore[4390]: W0829 17:34:57.603331 4390 upstream.go:468] message: 3f156101-4430-4b0f-adde-b569fc9c9063 process failure, node twowin not found
Aug 29 17:34:57 master cloudcore[4390]: W0829 17:34:57.604671 4390 upstream.go:468] message: f19819a8-cfa2-4ad4-b8fe-923fd8ffaafe process failure, node twowin not found
Aug 29 17:34:57 master cloudcore[4390]: W0829 17:34:57.612764 4390 upstream.go:468] message: 06bf976a-31ac-4ad5-9ac6-5e4218c502d0 process failure, node twowin not found
Aug 29 17:34:57 master cloudcore[4390]: E0829 17:34:57.625866 4390 messagehandler.go:602] Delete Success Point failed with error: objectsyncs.reliablesyncs.kubeedge.io "twowin." not found
Aug 29 17:34:57 master cloudcore[4390]: W0829 17:34:57.721322 4390 upstream.go:468] message: 832ec386-20ef-48bc-b1a7-8bdcbf873709 process failure, node twowin not found
Aug 29 17:35:07 master cloudcore[4390]: W0829 17:35:07.674800 4390 upstream.go:468] message: 78563fb9-6bde-497f-bc3b-dbc1c0118f1c process failure, node twowin not found
Aug 29 17:35:07 master cloudcore[4390]: W0829 17:35:07.778154 4390 upstream.go:468] message: 3d70bc05-76e7-4ef3-90d3-bf5670de3136 process failure, node twowin not found
Aug 29 17:35:17 master cloudcore[4390]: W0829 17:35:17.702964 4390 upstream.go:468] message: f5f782d3-bbca-40d5-af0b-70650e64771d process failure, node twowin not found
Aug 29 17:35:17 master cloudcore[4390]: W0829 17:35:17.840325 4390 upstream.go:468] message: e3c41f21-3403-4808-8e47-a78e830f1935 process failure, node twowin not found
Aug 29 17:35:27 master cloudcore[4390]: W0829 17:35:27.749316 4390 upstream.go:468] message: dd8db745-f1b9-4896-8a17-cc563e9e7003 process failure, node twowin not found
Aug 29 17:35:27 master cloudcore[4390]: W0829 17:35:27.899450 4390 upstream.go:468] message: aa421d63-b292-48db-a1be-822e44c94cce process failure, node twowin not found
Aug 29 17:35:37 master cloudcore[4390]: W0829 17:35:37.813122 4390 upstream.go:468] message: e90da3ef-a03b-45f1-8dc1-2329c7452a60 process failure, node twowin not found
Aug 29 17:35:37 master cloudcore[4390]: W0829 17:35:37.958092 4390 upstream.go:468] message: d23b339b-f012-44bc-b65b-41e604a89cbd process failure, node twowin not found
Aug 29 17:35:48 master cloudcore[4390]: W0829 17:35:48.017624 4390 upstream.go:468] message: d00a2d7b-629d-49ac-98e0-805de91047c5 process failure, node twowin not found
Aug 29 17:35:58 master cloudcore[4390]: W0829 17:35:58.077751 4390 upstream.go:468] message: 3c25320b-d254-46a8-8af9-6ecd8fd236ca process failure, node twowin not found
Aug 29 17:36:04 master cloudcore[4390]: W0829 17:36:04.596660 4390 upstream.go:764] message: f88a0c46-eb5c-4cd3-84e1-ee6e9db6a8e0 process failure, node twowin not found
Aug 29 17:36:11 master cloudcore[4390]: E0829 17:36:11.795240 4390 messagehandler.go:471] Failed to send event to node: twowin, affected event: id: 0d3b8d01-9a02-4276-8d6a-8f6fcf4fd87e, parent_id: , group: resource, source: edgecontroller, resource: aiot-test/pod/video-26-7dd8b74ddd-c4d6z, operation: update, err: use of closed network connection
Hi @LeiJJ8, can you provide the versions of cloudcore and edgecore?
Hi @vincentgoat, the version I am using is 1.10.0.
I think it is caused by a type assertion panic; more details are shown below.
Fortunately, this issue should be fixed once this PR https://github.com/kubeedge/kubeedge/pull/4105 is merged.
@vincentgoat Excuse me, has it been merged into the latest versions, for example 1.10.3 or 1.11.2?
This PR https://github.com/kubeedge/kubeedge/pull/4105 was just merged into the master branch and released in version v1.12.0. Is this problem reproduced frequently? We would highly appreciate it if you could work on it as well.
@vincentgoat It is often reproduced in 1.10.1. I will test the latest version.
The detailed log is as follows:
I0829 16:40:18.216591 4346 upstream.go:89] Dispatch message: a0a31574-3922-4820-b2fc-7dc65a3dd3e1
I0829 16:40:18.216596 4346 upstream.go:96] Message: a0a31574-3922-4820-b2fc-7dc65a3dd3e1, resource type is: membership/detail
I0829 16:40:18.221807 4346 upstream.go:89] Dispatch message: 39be648a-9cb0-4df4-adbd-5cc92809f2bf
I0829 16:40:18.221815 4346 upstream.go:96] Message: 39be648a-9cb0-4df4-adbd-5cc92809f2bf, resource type is: membership/detail
W0829 16:40:18.993887 4346 upstream.go:468] message: 63c59d88-ac53-4862-9f10-bd98eae50780 process failure, node dev15 not found
I0829 16:40:27.200015 4346 upstream.go:376] message: 76ec0bfc-f525-4023-9086-90690e432f06, pod delete successfully, namespace: aiot-test, name: remote-connect-21-59f9548d8c-mqvcb
W0829 16:40:27.255840 4346 messagehandler.go:394] node c1 is deleted, data for node will be cleaned up
E0829 16:40:27.255941 4346 ws.go:122] failed to read message, error: read tcp 172.18.185.67:10000->121.35.47.182:51954: use of closed network connection
W0829 16:40:27.256016 4346 upstream.go:187] parse message: 8ed7b6b9-0b92-4b55-9de3-6dab0723d8e6 resource type with error: resource type not found
I0829 16:40:27.256158 4346 synccontroller.go:148] ObjectSync c1.0566eec0-6352-44d5-a8eb-4ee105f18f34 will be deleted since node c1 has been deleted
W0829 16:40:27.256359 4346 messagehandler.go:201] Stop keepalive check for node: c1
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x17fb062]

goroutine 976557 [running]:
github.com/kubeedge/kubeedge/cloud/pkg/cloudhub/handler.(*MessageHandle).MessageWriteLoop(0xc000a3edc0, 0xc014447f80, 0xc000a9c100)
	/root/kubeedge/cloud/pkg/cloudhub/handler/messagehandler.go:441 +0xe2
created by github.com/kubeedge/kubeedge/cloud/pkg/cloudhub/handler.(*MessageHandle).ServeConn
	/root/kubeedge/cloud/pkg/cloudhub/handler/messagehandler.go:308 +0x208