kubeedge / kubeedge

Kubernetes Native Edge Computing Framework (project under CNCF)
https://kubeedge.io
Apache License 2.0

list-watch not running properly on edge node #3445

Open davedavedavid opened 2 years ago

davedavedavid commented 2 years ago

After a Service is created in Kubernetes, list-watch on the edge node behaves abnormally: the ADDED event is followed by a series of consecutive DELETED events for the same object.

What you expected to happen: Services can be added properly on edge nodes.

{"type":"ADDED","object":{"apiVersion":"v1","kind":"Service","metadata":{"creationTimestamp":"2021-12-08T01:46:38Z","labels":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"managedFields":[{"apiVersion":"v1","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"75fb5982-8eb2-45be-b901-ab1e8418dc3c\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:ports":{".":{},"k:{\"port\":7363,\"protocol\":\"TCP\"}":{".":{},"f:name":{},"f:port":{},"f:protocol":{},"f:targetPort":{}}},"f:selector":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:sessionAffinity":{},"f:type":{}}},"manager":"sedna-gm","operation":"Update","time":"2021-12-08T01:46:38Z"}],"name":"ct-yolo-v5-aggregation","namespace":"default","ownerReferences":[{"apiVersion":"sedna.io/v1alpha1","blockOwnerDeletion":true,"controller":true,"kind":"FederatedLearningJob","name":"ct-yolo-v5","uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c"}],"resourceVersion":"21541222","uid":"5f815bbc-4c47-429e-8b40-a0c1ad58d0ec"},"spec":{"clusterIP":"130.2.251.237","clusterIPs":["130.2.251.237"],"ipFamilies":["IPv4"],"ipFamilyPolicy":"SingleStack","ports":[{"name":"tcp-0","port":7363,"protocol":"TCP","targetPort":7363}],"selector":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"sessionAffinity":"None","type":"ClusterIP"},"status":{"loadBalancer":{}}}} 
{"type":"DELETED","object":{"apiVersion":"v1","kind":"Service","metadata":{"creationTimestamp":"2021-12-08T01:46:38Z","labels":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"managedFields":[{"apiVersion":"v1","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"75fb5982-8eb2-45be-b901-ab1e8418dc3c\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:ports":{".":{},"k:{\"port\":7363,\"protocol\":\"TCP\"}":{".":{},"f:name":{},"f:port":{},"f:protocol":{},"f:targetPort":{}}},"f:selector":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:sessionAffinity":{},"f:type":{}}},"manager":"sedna-gm","operation":"Update","time":"2021-12-08T01:46:38Z"}],"name":"ct-yolo-v5-aggregation","namespace":"default","ownerReferences":[{"apiVersion":"sedna.io/v1alpha1","blockOwnerDeletion":true,"controller":true,"kind":"FederatedLearningJob","name":"ct-yolo-v5","uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c"}],"resourceVersion":"21541222","uid":"5f815bbc-4c47-429e-8b40-a0c1ad58d0ec"},"spec":{"clusterIP":"130.2.251.237","clusterIPs":["130.2.251.237"],"ipFamilies":["IPv4"],"ipFamilyPolicy":"SingleStack","ports":[{"name":"tcp-0","port":7363,"protocol":"TCP","targetPort":7363}],"selector":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"sessionAffinity":"None","type":"ClusterIP"},"status":{"loadBalancer":{}}}} 
{"type":"DELETED","object":{"apiVersion":"v1","kind":"Service","metadata":{"creationTimestamp":"2021-12-08T01:46:38Z","labels":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"managedFields":[{"apiVersion":"v1","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"75fb5982-8eb2-45be-b901-ab1e8418dc3c\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:ports":{".":{},"k:{\"port\":7363,\"protocol\":\"TCP\"}":{".":{},"f:name":{},"f:port":{},"f:protocol":{},"f:targetPort":{}}},"f:selector":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:sessionAffinity":{},"f:type":{}}},"manager":"sedna-gm","operation":"Update","time":"2021-12-08T01:46:38Z"}],"name":"ct-yolo-v5-aggregation","namespace":"default","ownerReferences":[{"apiVersion":"sedna.io/v1alpha1","blockOwnerDeletion":true,"controller":true,"kind":"FederatedLearningJob","name":"ct-yolo-v5","uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c"}],"resourceVersion":"21541222","uid":"5f815bbc-4c47-429e-8b40-a0c1ad58d0ec"},"spec":{"clusterIP":"130.2.251.237","clusterIPs":["130.2.251.237"],"ipFamilies":["IPv4"],"ipFamilyPolicy":"SingleStack","ports":[{"name":"tcp-0","port":7363,"protocol":"TCP","targetPort":7363}],"selector":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"sessionAffinity":"None","type":"ClusterIP"},"status":{"loadBalancer":{}}}} 
{"type":"DELETED","object":{"apiVersion":"v1","kind":"Service","metadata":{"creationTimestamp":"2021-12-08T01:46:38Z","labels":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"managedFields":[{"apiVersion":"v1","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"75fb5982-8eb2-45be-b901-ab1e8418dc3c\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:ports":{".":{},"k:{\"port\":7363,\"protocol\":\"TCP\"}":{".":{},"f:name":{},"f:port":{},"f:protocol":{},"f:targetPort":{}}},"f:selector":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:sessionAffinity":{},"f:type":{}}},"manager":"sedna-gm","operation":"Update","time":"2021-12-08T01:46:38Z"}],"name":"ct-yolo-v5-aggregation","namespace":"default","ownerReferences":[{"apiVersion":"sedna.io/v1alpha1","blockOwnerDeletion":true,"controller":true,"kind":"FederatedLearningJob","name":"ct-yolo-v5","uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c"}],"resourceVersion":"21541222","uid":"5f815bbc-4c47-429e-8b40-a0c1ad58d0ec"},"spec":{"clusterIP":"130.2.251.237","clusterIPs":["130.2.251.237"],"ipFamilies":["IPv4"],"ipFamilyPolicy":"SingleStack","ports":[{"name":"tcp-0","port":7363,"protocol":"TCP","targetPort":7363}],"selector":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"sessionAffinity":"None","type":"ClusterIP"},"status":{"loadBalancer":{}}}} 
{"type":"DELETED","object":{"apiVersion":"v1","kind":"Service","metadata":{"creationTimestamp":"2021-12-08T01:46:38Z","labels":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"managedFields":[{"apiVersion":"v1","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"75fb5982-8eb2-45be-b901-ab1e8418dc3c\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:ports":{".":{},"k:{\"port\":7363,\"protocol\":\"TCP\"}":{".":{},"f:name":{},"f:port":{},"f:protocol":{},"f:targetPort":{}}},"f:selector":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:sessionAffinity":{},"f:type":{}}},"manager":"sedna-gm","operation":"Update","time":"2021-12-08T01:46:38Z"}],"name":"ct-yolo-v5-aggregation","namespace":"default","ownerReferences":[{"apiVersion":"sedna.io/v1alpha1","blockOwnerDeletion":true,"controller":true,"kind":"FederatedLearningJob","name":"ct-yolo-v5","uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c"}],"resourceVersion":"21541222","uid":"5f815bbc-4c47-429e-8b40-a0c1ad58d0ec"},"spec":{"clusterIP":"130.2.251.237","clusterIPs":["130.2.251.237"],"ipFamilies":["IPv4"],"ipFamilyPolicy":"SingleStack","ports":[{"name":"tcp-0","port":7363,"protocol":"TCP","targetPort":7363}],"selector":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"sessionAffinity":"None","type":"ClusterIP"},"status":{"loadBalancer":{}}}} 
{"type":"DELETED","object":{"apiVersion":"v1","kind":"Service","metadata":{"creationTimestamp":"2021-12-08T01:46:38Z","labels":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"managedFields":[{"apiVersion":"v1","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"75fb5982-8eb2-45be-b901-ab1e8418dc3c\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:ports":{".":{},"k:{\"port\":7363,\"protocol\":\"TCP\"}":{".":{},"f:name":{},"f:port":{},"f:protocol":{},"f:targetPort":{}}},"f:selector":{".":{},"f:federatedlearningjob.sedna.io/name":{},"f:federatedlearningjob.sedna.io/uid":{},"f:federatedlearningjob.sedna.io/worker-type":{}},"f:sessionAffinity":{},"f:type":{}}},"manager":"sedna-gm","operation":"Update","time":"2021-12-08T01:46:38Z"}],"name":"ct-yolo-v5-aggregation","namespace":"default","ownerReferences":[{"apiVersion":"sedna.io/v1alpha1","blockOwnerDeletion":true,"controller":true,"kind":"FederatedLearningJob","name":"ct-yolo-v5","uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c"}],"resourceVersion":"21541222","uid":"5f815bbc-4c47-429e-8b40-a0c1ad58d0ec"},"spec":{"clusterIP":"130.2.251.237","clusterIPs":["130.2.251.237"],"ipFamilies":["IPv4"],"ipFamilyPolicy":"SingleStack","ports":[{"name":"tcp-0","port":7363,"protocol":"TCP","targetPort":7363}],"selector":{"federatedlearningjob.sedna.io/name":"ct-yolo-v5","federatedlearningjob.sedna.io/uid":"75fb5982-8eb2-45be-b901-ab1e8418dc3c","federatedlearningjob.sedna.io/worker-type":"aggregation"},"sessionAffinity":"None","type":"ClusterIP"},"status":{"loadBalancer":{}}}}
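One way to make the pattern in the events above easier to spot is to tally event types per object UID. The following is a minimal sketch, assuming the watch output is available as one JSON event per line in the format shown; the helper names (`tally_watch_events`, `suspicious_uids`, `_event`) are hypothetical, and the sample events are trimmed stand-ins for the full events in this report.

```python
import json
from collections import defaultdict


def tally_watch_events(lines):
    """Count watch event types per object UID from JSON-lines watch output."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        event = json.loads(line)
        uid = event["object"]["metadata"]["uid"]
        counts[uid][event["type"]] += 1
    return {uid: dict(types) for uid, types in counts.items()}


def suspicious_uids(counts):
    """Return UIDs that received more than one DELETED event."""
    return [uid for uid, types in counts.items() if types.get("DELETED", 0) > 1]


def _event(event_type):
    # Trimmed stand-in for the events in this report (hypothetical helper).
    return json.dumps({
        "type": event_type,
        "object": {
            "kind": "Service",
            "metadata": {
                "name": "ct-yolo-v5-aggregation",
                "namespace": "default",
                "uid": "5f815bbc-4c47-429e-8b40-a0c1ad58d0ec",
            },
        },
    })


sample = [_event("ADDED")] + [_event("DELETED")] * 5
counts = tally_watch_events(sample)
print(counts)
print(suspicious_uids(counts))
```

For the events in this report, the single Service UID shows one ADDED and five DELETED events, so it is flagged.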

How to reproduce it (as minimally and precisely as possible): Add edge nodes to the cluster through KubeEdge, create a Kubernetes Service, and observe the list-watch on the edge nodes.

Anything else we need to know?:

Environment:

davedavedavid commented 2 years ago

It recovers after running "kubectl delete objectsync --all".

llhuii commented 2 years ago

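Summary: KubeEdge's objectsync custom resource still holds a leftover Service object. After a new Service with the same name is created, cloudcore's synccontroller keeps sending DELETED events while handling the resource with the same name but a different UID, so the watch on the edge side receives delete events.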

Specifically:

fisherxu commented 2 years ago

@llhuii it's better to use English :)

llhuii commented 2 years ago

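A few questions:

  1. Why is there a leftover objectsync object? When is it cleaned up? When an object with the same name but a different UID arrives, why is the leftover not cleaned up?
  2. When objectsync handles the case of the same name but a different UID, it should not send a delete event for the new object.
  3. When edgecore's metaserver handles a watch, it also needs to consider the case where the UID differs, not only the following fields: https://github.com/kubeedge/kubeedge/blob/acf3985312fb017707616599e891c83e958cc18f/edge/pkg/metamanager/metaserver/kubernetes/storage/sqlite/imitator/watchhook/hooks.go#L68-L74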
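
The behavior asked for in the points above can be sketched as follows. This is only an illustrative model, not KubeEdge code: `ObjectRef`, `reconcile`, `event_matches`, and the action strings are all hypothetical names, and treating namespace and name as the other identifying fields is an assumption. The idea is that a stored sync record whose UID no longer matches the live object is treated as stale and dropped without emitting a delete for the new object, and that an edge-side watch filter compares the UID as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectRef:
    """Hypothetical identity of a synced object."""
    namespace: str
    name: str
    uid: str


def reconcile(record: ObjectRef, live: Optional[ObjectRef]) -> str:
    """Decide what to do with a stored sync record.

    `live` is the object currently found under the same namespace/name,
    or None if no such object exists.
    """
    if live is None:
        # The recorded object is gone: notify the edge, then drop the record.
        return "send-delete-and-drop-record"
    if live.uid != record.uid:
        # Same name, different UID: the record is stale. Drop it, and do
        # not send a delete event addressed to the new object.
        return "drop-stale-record"
    return "keep"


def event_matches(watched: ObjectRef, event_obj: ObjectRef) -> bool:
    """Edge-side filter that also compares the UID, not only namespace/name."""
    return (watched.namespace, watched.name, watched.uid) == (
        event_obj.namespace, event_obj.name, event_obj.uid)


old = ObjectRef("default", "ct-yolo-v5-aggregation", "old-uid")
new = ObjectRef("default", "ct-yolo-v5-aggregation",
                "5f815bbc-4c47-429e-8b40-a0c1ad58d0ec")

print(reconcile(old, new))       # stale record, new object untouched
print(reconcile(old, None))      # object really deleted
print(event_matches(new, old))   # a delete for the old UID does not match
```

With a check like this, a delete that belongs to the old UID would not be delivered to a watcher tracking the newly created Service of the same name.
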

stale[bot] commented 1 year ago

Hello 👋 Looks like there was no activity on this issue for last 90 days. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).

huapox commented 1 year ago

mark

victorming666 commented 6 months ago

It seems kubeedge 1.13.0 has fixed this issue. The test of edgemesh cloud-edge tcp echo passed on kubeedge-1.13.0+edgemesh-1.14.0.

biao-lvwan commented 6 months ago

It's not fixed.

letweare commented 4 months ago

This bug still exists. Is there a new fix or workaround?