Open ashishkurmi opened 1 year ago
Hi,
Based on the issue description, I believe this might have been fixed by https://github.com/cilium/tetragon/pull/1282. Could you try the latest image (quay.io/cilium/tetragon-ci:latest) and see if it fixes the issue?
Thanks so much @kkourt for fixing this issue, my repro doesn't work with the latest image! I now see non-ASCII characters in the output that I believe were previously causing this issue: 🚀 process default/dind /proc/self/exe init /var ig_map_stats _recursive � �� ȳ�}� � �~�~�@�Q�~�P�Q�~��~�#�~�~� @�?,~�@ h �#1�@�~�
Would this fix be merged into the existing Tetragon stable release (v0.10.0)?
The change was already backported in v0.10: https://github.com/cilium/tetragon/pull/1285, so it will be part of v0.10.1.
What happened?
Bug Description
When retrieving Tetragon events from the gRPC endpoint using the tetra CLI, the CLI occasionally fails with the following error message:
kubectl exec -it -n kube-system ds/tetragon -c tetragon -- tetra getevents -o compact -n default
time="xxx" level=fatal msg="Failed to receive events" error="rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8"
command terminated with exit code 1
For example:
:boom: exit default/dind /usr/local/bin/dockerd --host=unix:///var/run/docker.sock --host=tcp://0.0.0.0:2376 --tlsverify --tlscacert /certs/server/ca.pem --tlscert /certs/server/cert.pem --tlskey /certs/server/key.pem 253
:electric_plug: connect default/dind /usr/local/bin/dockerd tcp 10.0.5.148:42774 -> 34.205.13.154:443
:rocket: process default/dind /usr/local/bin/runc --log /var/lib/docker/buildkit/executor/runc-log.json --log-format json run --bundle /var/lib/docker/buildkit/executor/tyvqwqdjam33m7564no4hhirw tyvqwqdjam33m7564no4hhirw
time="2023-07-25T00:40:17Z" level=fatal msg="Failed to receive events" error="rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8"
command terminated with exit code 1
Repro Steps
Please follow these steps to repro the error scenario. These steps work for me on AWS EKS.
Configure Tetragon and start listening for events
Generate Tetragon error events
You can run docker build . multiple times in the pod. For me, it consistently generates the error scenario.
Tetragon Version
0.10.0
Kernel Version
Linux ip-10-0-51-149.us-west-2.compute.internal 5.10.184-175.731.amzn2.x86_64 #1 SMP Tue Jun 27 21:48:55 UTC 2023 x86_64 Linux
Kubernetes Version
Server Version: version.Info{Major:"1", Minor:"27+", GitVersion:"v1.27.3-eks-a5565ad", GitCommit:"78c8293d1c65e8a153bf3c03802ab9358c0e1a14", GitTreeState:"clean", BuildDate:"2023-06-16T17:32:40Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}
Bugtool
No response
Relevant log output
Anything else?
No response