abahmed / kwatch

:eyes: monitor & detect crashes in your Kubernetes(K8s) cluster instantly
https://kwatch.dev
MIT License
959 stars 75 forks

alerts are sent when a container inside an ignored pod crashes #327

Open nwsparks opened 1 month ago

nwsparks commented 1 month ago

Describe the bug

Ignoring a pod does not suppress alerts for container crashes within that pod. Given this, it is unclear what purpose ignoring a pod serves.

To Reproduce

Configure kwatch to ignore a pod, then crash a container within that pod. The easiest way is probably to exec into the container and run exit 1?

Expected behavior

I would expect that ignoring a pod would also ignore all containers within it.

Actual behavior

An alert is sent for the container, even though the log shows the pod itself being skipped:

time="2024-07-16T12:20:07Z" level=info msg="skipping pod hubble-relay-597464b475-496gx as it is in the ignore pod name list"
time="2024-07-16T12:20:07Z" level=info msg="container only issue hubble-relay hubble-relay-597464b475-496gx hubble-relay-597464b475 Error level=info msg=\"Starting gRPC health server...\" addr=\":4222\" subsys=hubble-relay\nlevel=info msg=\"Starting gRPC server...\" options=\"{peerTarget:hubble-peer.kube-system.svc.cluster.local:443 dialTimeout:5000000000 retryTimeout:30000000000 listenAddress::4245 healthListenAddress::4222 metricsListenAddress: log:0xc00031e2a0 serverTLSConfig:<nil> insecureServer:true clientTLSConfig:0xc0005f0408 clusterName:default insecureClient:false observerOptions:[0x1f0c400 0x1f0c4e0] grpcMetrics:<nil> grpcUnaryInterceptors:[] grpcStreamInterceptors:[]}\" subsys=hubble-relay\nlevel=info msg=\"Received peer change notification\" change notification=\"name:\\\"ip-100-65-3-234.ec2.internal\\\" address:\\\"100.65.3.234\\\" type:PEER_ADDED tls:{server_name:\\\"ip-100-65-3-234-ec2-internal.default.hubble-grpc.cilium.io\\\"}\" subsys=hubble-relay\nlevel=info msg=\"Received peer change notification\" change notification=\"name:\\\"ip-100-65-2-243.ec2.internal\\\" address:\\\"100.65.2.243\\\" type:PEER_ADDED tls:{server_name:\\\"ip-100-65-2-243-ec2-internal.default.hubble-grpc.cilium.io\\\"}\" subsys=hubble-relay\nlevel=info msg=Connecting address=\"100.65.2.243:4244\" hubble-tls=true peer=ip-100-65-2-243.ec2.internal subsys=hubble-relay\nlevel=info msg=Connecting address=\"100.65.3.234:4244\" hubble-tls=true peer=ip-100-65-3-234.ec2.internal subsys=hubble-relay\nlevel=info msg=Connected address=\"100.65.3.234:4244\" hubble-tls=true peer=ip-100-65-3-234.ec2.internal subsys=hubble-relay\nlevel=info msg=Connected address=\"100.65.2.243:4244\" hubble-tls=true peer=ip-100-65-2-243.ec2.internal subsys=hubble-relay\nlevel=info msg=\"Stopping server...\" subsys=hubble-relay\nlevel=warning msg=\"Error while receiving peer change notification; will try again after the timeout has expired\" connection timeout=30s error=\"rpc error: code = Canceled desc = context canceled\" subsys=hubble-relay\nlevel=info msg=\"Server stopped\" subsys=hubble-relay\n 137"
time="2024-07-16T12:20:07Z" level=info msg="sending event: {PodName:hubble-relay-597464b475-496gx ContainerName:hubble-relay Namespace:kube-system Reason:Error Events:[2024-07-16 12:08:57 +0000 UTC] 

Version/Commit

0.9.5