ztdawang opened this issue 1 year ago
Hi @ztdawang, can you confirm that the output above is the full log? If not, please provide the entire thing.
Can you explain what behavior you are expecting to see? My understanding is that this error message in isolation is not a problem (short lived processes/pods do happen and may cause this).
Does Pixie support Cilium DSR mode, i.e. tunnel=disabled?
I haven't used cilium before, but from my brief research I believe Pixie should work with dsr mode. For many of Pixie's visualizations (pxl scripts), we provide the source by its pod name. That association may not work but the protocol tracing should still see all requests and responses to and from a given pod.
The symptom is that none of Pixie's scripts return any observability data.
@ztdawang what protocol traffic are you expecting to see? I'm not sure I see anything that points to a debian specific issue. The more details you can provide about what is running on the cluster and what protocol data should be visible will advise where to take a deeper look.
I can't get any data on the web UI. I deployed Pixie in self-hosted mode.
Are you able to explain what processes you have running on the machine that pixie is compatible with? Knowing what data is missing (pgsql, mysql, http), whether it's encrypted or not and the language it is written in would help guide where we can look into things.
My environment is a new k8s cluster with no applications deployed. All pods in the plc and olm namespaces run normally, but in the pl namespace the agent, vizier-pem, reports the cgroup2-related errors in the screenshot above. Two other pods restart every few tens of minutes because they fail their health checks; I can't remember their names because I'm not at the computer right now. At the moment I only remember vizier-pem.
At the moment, I don't believe the logs shared above (cgroup2) indicate a problem. However, the crashing pods are worth looking into further. Would you be able to provide kubectl -n pl get pods and kubectl describe for the crashing pods?
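For reference, a typical set of commands for gathering that information (replace <pod-name> with the name of a crashing pod):

```shell
# List pod status and restart counts in the pl namespace
kubectl -n pl get pods

# Show the pod's last state, exit code, and recent events
kubectl -n pl describe pod <pod-name>

# Logs from the previous (crashed) container instance,
# which usually contain the reason for the restart
kubectl -n pl logs <pod-name> --previous
```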
The biggest problem may be that data is not being transferred properly between the agent and the cloud.
$ kubectl -n pl logs -f vizier-cloud-connector-5987cf847-k85rh
time="2023-07-21T21:09:59Z" level=info msg="[core] Channel authority set to \"vzconn-service.plc.svc.cluster.local:51600\"" system=system
time="2023-07-21T21:09:59Z" level=info msg="[core] ccResolverWrapper: sending update to cc: {[{vzconn-service.plc.svc.cluster.local:51600
$ kubectl -n pl describe po vizier-cloud-connector-5987cf847-k85rh
    Port:           50800/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sat, 22 Jul 2023 05:09:56 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sat, 22 Jul 2023 04:47:25 +0800
      Finished:     Sat, 22 Jul 2023 05:09:55 +0800
    Ready:          True
    Restart Count:  43
$ kubectl -n pl logs -f vizier-metadata-dc587cd79-r66bc --previous
time="2023-07-21T21:29:55Z" level=info msg="[transport] transport: loopyWriter.run returning. connection error: desc = \"transport is closing\"" system=system
E0721 21:32:26.969770 1 leaderelection.go:367] Failed to update lock: resource name may not be empty
I0721 21:32:26.969817 1 leaderelection.go:283] failed to renew lease pl/metadata-election: timed out waiting for the condition
time="2023-07-21T21:32:33Z" level=warning msg="Leadership lost. This can occur when the K8s API has heavy resource utilization or high network latency and fails to respond within 1875ms. This usually resolves by itself after some time. Terminating to retry..."
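The "failed to renew lease pl/metadata-election" line means the metadata pod lost its leader-election lease, which the warning attributes to K8s API latency. Two generic checks (standard kubectl, nothing Pixie-specific) that may help narrow this down:

```shell
# Inspect the leader-election lease named in the log above
kubectl -n pl get lease metadata-election -o yaml

# Check API-server health verbosely; slow or failing checks here
# would be consistent with the 1875ms renewal timeout in the warning
kubectl get --raw='/readyz?verbose'
```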
$ kubectl -n pl describe po vizier-metadata-dc587cd79-r66bc
Port:
I redeployed the Pixie agents, but the result is still the same: the list of agents displayed on the web UI is empty, and there is no data in the Live View.
I installed another open-source eBPF-based observability tool and everything works fine, which shows that my k8s cluster itself is definitely fine.
Hello @ztdawang, could you share which tool you used? I really want to try it too, thank you :)
vizier-pem logs errors in the following environment: Debian 12.0 + k8s 1.26 + Cilium 1.13.4 + cgroup v2
W20230720 02:33:23.604429 521629 state_manager.cc:277] Failed to read PID info for pod=f3c1efa9-5f0c-4d7c-b79d-f335e2d4ce7b, cid= [msg=Failed to open file /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf3c1efa9_5f0c_4d7c_b79d_f335e2d4ce7b.slice/cri-containerd-.scope/cgroup.procs]
W20230720 02:33:23.605147 521629 state_manager.cc:277] Failed to read PID info for pod=80f4c2f9-2479-4e6d-95a5-bbcf3a026001, cid=d5115e549992716ada3ae413ae3311be5ad285feb6762ca33f294dce3aa22e5d [msg=Failed to open file /sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod80f4c2f9_2479_4e6d_95a5_bbcf3a026001.slice/cri-containerd-d5115e549992716ada3ae413ae3311be5ad285feb6762ca33f294dce3aa22e5d.scope/cgroup.procs]
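Judging by the paths in those warnings, the PEM builds the cgroup v2 path from the pod's QoS class, its UID (with dashes replaced by underscores, per systemd slice naming), and the container ID. A minimal sketch of that mapping (my own reconstruction, not Pixie's actual code), which you can use to check by hand whether the expected cgroup.procs file exists on a node:

```shell
#!/usr/bin/env bash
# Reconstruct the cgroup v2 cgroup.procs path probed in the warnings above,
# given a pod's QoS class (burstable/besteffort/guaranteed), its UID, and a
# containerd container ID. Best-guess sketch based on the logged paths.
cgroup_procs_path() {
  local qos="$1" pod_uid="$2" cid="$3"
  # systemd slice names use '_' where the pod UID has '-'
  local uid_us="${pod_uid//-/_}"
  if [ "$qos" = "guaranteed" ]; then
    # Guaranteed pods sit directly under kubepods.slice (no QoS sub-slice)
    echo "/sys/fs/cgroup/kubepods.slice/kubepods-pod${uid_us}.slice/cri-containerd-${cid}.scope/cgroup.procs"
  else
    echo "/sys/fs/cgroup/kubepods.slice/kubepods-${qos}.slice/kubepods-${qos}-pod${uid_us}.slice/cri-containerd-${cid}.scope/cgroup.procs"
  fi
}

# Example: the besteffort pod from the second warning line above
cgroup_procs_path besteffort 80f4c2f9-2479-4e6d-95a5-bbcf3a026001 \
  d5115e549992716ada3ae413ae3311be5ad285feb6762ca33f294dce3aa22e5d
```

Note that in the first warning the cid is empty, which produces the odd-looking "cri-containerd-.scope" segment; that is consistent with a short-lived container whose ID the metadata store had not yet resolved.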
kubectl -n pl logs -f vizier-metadata-dc587cd79-8c6h9
time="2023-07-20T04:34:49Z" level=info msg="[transport] transport: loopyWriter.run returning. connection error: desc = \"transport is closing\"" system=system
The other question is: does Pixie support Cilium DSR mode, i.e. tunnel=disabled?
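To confirm what mode Cilium is actually running in, you can read its ConfigMap (these key names are from Cilium 1.13 and may differ in other versions):

```shell
# Tunnel setting: "disabled" means native routing, a prerequisite for DSR
kubectl -n kube-system get configmap cilium-config -o jsonpath='{.data.tunnel}'

# Load-balancer mode: should read "dsr" when DSR is enabled
kubectl -n kube-system get configmap cilium-config -o jsonpath='{.data.bpf-lb-mode}'
```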