kmesh-net / kmesh

High Performance ServiceMesh Data Plane Based on Programmable Kernel
https://kmesh.net
Apache License 2.0
362 stars 46 forks source link

Unexpected error log #405

Open hzxuzhonghu opened 1 month ago

hzxuzhonghu commented 1 month ago

What happened:

root@kurator-linux-0001:~/go/src/kmesh.net/kmesh# k logs kmesh-j6tfz  -n kmesh-system
time="2024-05-31T02:51:47Z" level=info msg="FLAG: --bpf-fs-path=\"/sys/fs/bpf\"" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="FLAG: --cgroup2-path=\"/mnt/kmesh_cgroup2\"" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="FLAG: --cni-etc-path=\"/etc/cni/net.d\"" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="FLAG: --conflist-name=\"\"" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="FLAG: --enable-bypass=\"false\"" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="FLAG: --enable-mda=\"false\"" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="FLAG: --help=\"false\"" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="FLAG: --mode=\"workload\"" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="FLAG: --plugin-cni-chained=\"true\"" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="bpf Start successful" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="service node ztunnel~10.244.0.219~kmesh-j6tfz.kmesh-system~kmesh-system.svc.cluster.local connect to discovery address istiod.istio-system.svc:15012" subsys=controller/config
time="2024-05-31T02:51:47Z" level=info msg="controller Start successful" subsys=manager
time="2024-05-31T02:51:47Z" level=info msg="start write CNI config\n" subsys="cni installer"
time="2024-05-31T02:51:47Z" level=info msg="kmesh cni use chained\n" subsys="cni installer"
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/default/bookinfo-reviews-istio-waypoint-86b5c9cd7d-fhfkc not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/default/productpage-v1-675fc69cf-q2h9b not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/default/reviews-v2-5b667bcbf8-cscb8 not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/istio-system/istiod-57dfcb6555-sw277 not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/kmesh-system/kmesh-j6tfz not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/kube-system/coredns-76f75df574-kcvsb not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/kube-system/coredns-76f75df574-r8pxv not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/local-path-storage/local-path-provisioner-7577fdbbfb-lhqqv not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/default/reviews-v3-5b9bd44f4-vzrvp not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/default/details-v1-698d88b-m2kh7 not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/default/reviews-v1-5b5d6494f4-c5n7n not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/istio-system/istio-ingressgateway-7bf844ffb6-w9rfc not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/istio-system/ztunnel-5hftj not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/default/sleep-7656cf8794-fcqp7 not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/default/ratings-v1-6484c4d9bb-k7r5d not found" subsys=workload_controller
time="2024-05-31T02:51:47Z" level=error msg="workload Kubernetes//Pod/default/httpbin-86b8ffc5ff-czmdb not found" subsys=workload_controller
time="2024-05-31T02:51:49Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:49Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:49Z" level=error msg="fetcheCert spiffe:///ns/kube-system/sa/coredns error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:49Z" level=error msg="fetcheCert spiffe:///ns/default/sa/bookinfo-productpage error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:50Z" level=info msg="Copied /usr/bin/kmesh-cni to /opt/cni/bin." subsys="cni installer"
time="2024-05-31T02:51:50Z" level=info msg="kubeconfig either does not exist or is out of date, writing a new one" subsys="cni installer"
time="2024-05-31T02:51:50Z" level=info msg="wrote kubeconfig file /etc/cni/net.d/kmesh-cni-kubeconfig" subsys="cni installer"
time="2024-05-31T02:51:50Z" level=info msg="cni config file: /etc/cni/net.d/10-kindnet.conflist" subsys="cni installer"
time="2024-05-31T02:51:50Z" level=info msg="command Start cni successful" subsys=manager
time="2024-05-31T02:51:50Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:50Z" level=error msg="create certificate: rpc error: code = Canceled desc = grpc: the client connection is closing reconnect..." subsys=security
time="2024-05-31T02:51:50Z" level=error msg="fetcheCert spiffe:///ns/default/sa/bookinfo-details error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:50Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:50Z" level=error msg="create certificate: rpc error: code = Canceled desc = grpc: the client connection is closing reconnect..." subsys=security
time="2024-05-31T02:51:50Z" level=error msg="fetcheCert spiffe:///ns/kmesh-system/sa/kmesh error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:50Z" level=error msg="fetcheCert spiffe:///ns/default/sa/httpbin error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:50Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:50Z" level=error msg="fetcheCert spiffe:///ns/local-path-storage/sa/local-path-provisioner-service-account error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:51Z" level=error msg="fetcheCert spiffe:///ns/istio-system/sa/istio-ingressgateway-service-account error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:51Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:51Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:51Z" level=error msg="fetcheCert spiffe:///ns/istio-system/sa/istiod error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:51Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:51Z" level=error msg="fetcheCert spiffe:///ns/default/sa/bookinfo-reviews error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:51Z" level=error msg="fetcheCert spiffe:///ns/default/sa/bookinfo-ratings error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:51Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:51Z" level=error msg="fetcheCert spiffe:///ns/istio-system/sa/ztunnel error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:51Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:51Z" level=error msg="fetcheCert spiffe:///ns/default/sa/bookinfo-reviews-istio-waypoint error: failed to get certChainPEM" subsys=security
time="2024-05-31T02:51:52Z" level=error msg="create certificate: rpc error: code = Unauthenticated desc = request impersonation authentication failure reconnect..." subsys=security
time="2024-05-31T02:51:52Z" level=error msg="fetcheCert spiffe:///ns/default/sa/sleep error: failed to get certChainPEM" subsys=security

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

hzxuzhonghu commented 1 month ago

@lec-bit Can you take a look

hzxuzhonghu commented 1 month ago

Istiod log

2024-05-31T03:33:39.615011Z info    ads Push debounce stable[10] 1 for config Address//Kubernetes//Pod/kmesh-system/kmesh-756l6: 104.422892ms since last change, 104.422763ms since last push, full=false
2024-05-31T03:33:39.615136Z info    ads XDS: Incremental Pushing ConnectedEndpoints:4 Version:2024-05-31T03:30:16Z/3
2024-05-31T03:33:41.058828Z warn    serverca    impersonation failed for identity spiffe:///ns/istio-system/sa/ztunnel, error: no node authorizer for cluster   client=10.244.0.252:60882
2024-05-31T03:33:41.667091Z warn    serverca    impersonation failed for identity spiffe:///ns/istio-system/sa/ztunnel, error: no node authorizer for cluster   client=10.244.0.252:60896
2024-05-31T03:33:41.669378Z warn    serverca    impersonation failed for identity spiffe:///ns/default/sa/sleep, error: no node authorizer for cluster  client=10.244.0.252:60896
2024-05-31T03:33:41.669574Z warn    serverca    impersonation failed for identity spiffe:///ns/default/sa/bookinfo-productpage, error: no node authorizer for cluster   client=10.244.0.252:60896
2024-05-31T03:33:41.669738Z warn    serverca    impersonation failed for identity spiffe:///ns/default/sa/bookinfo-reviews, error: no node authorizer for cluster   client=10.244.0.252:60896
2024-05-31T03:33:42.355468Z warn    serverca    impersonation failed for identity spiffe:///ns/default/sa/bookinfo-reviews, error: no node authorizer for cluster   client=10.244.0.252:60908
2024-05-31T03:33:42.355718Z warn    serverca    impersonation failed for identity spiffe:///ns/default/sa/httpbin, error: no node authorizer for cluster    client=10.244.0.252:60908
2024-05-31T03:33:42.355883Z warn    serverca    impersonation failed for identity spiffe:///ns/default/sa/bookinfo-details, error: no node authorizer for cluster   client=10.244.0.252:60908
2024-05-31T03:33:42.568300Z warn    serverca    impersonation failed for identity spiffe:///ns/default/sa/bookinfo-details, error: no node authorizer for cluster   client=10.244.0.252:60914
2024-05-31T03:33:42.569249Z warn    serverca    impersonation failed for identity spiffe:///ns/kube-system/sa/coredns, error: no node authorizer for cluster    client=10.244.0.252:60914
2024-05-31T03:33:42.655705Z warn    serverca    impersonation failed for identity spiffe:///ns/kube-system/sa/coredns, error: no node authorizer for cluster    client=10.244.0.252:60916
2024-05-31T03:33:42.657634Z warn    serverca    impersonation failed for identity spiffe:///ns/istio-system/sa/istiod, error: no node authorizer for cluster    client=10.244.0.252:60916
2024-05-31T03:33:42.665555Z warn    serverca    impersonation failed for identity spiffe:///ns/istio-system/sa/istiod, error: no node authorizer for cluster    client=10.244.0.252:60932
lec-bit commented 1 month ago

The query operation is performed before the add operation. As a result, the query fails and a large number of errors are reported.

carlory commented 3 weeks ago

https://github.com/kmesh-net/kmesh/blob/76f66dfed7f5e47a8c373c457bc88c7e2fb69ddb/pkg/controller/security/caclient.go#L91

The ctx doesn't contain ClusterID key.

FYI https://github.com/istio/istio/blob/master/security/pkg/nodeagent/caclient/providers/citadel/client.go#L107C59-L107C105

hzxuzhonghu commented 3 weeks ago

Fixed it in https://github.com/kmesh-net/kmesh/pull/416