airwallex / k8s-pod-restart-info-collector

Automated troubleshooting of Kubernetes Pods issues. Collect K8s pod restart reasons, logs, and events automatically.

feat: Ignore specific errors for given pods #39

Open nikup opened 7 months ago

nikup commented 7 months ago

Add support for ignoring a specific restart error for a specific type of pod. Example: we have a gke-metrics-agent pod that sometimes fails to start with the error "Failed to run the service: failed to start extensions: listen tcp 127.0.0.1:8203: bind: address already in use". With the current implementation, we can ignore that error by adding:

ignoredErrorsForPodNamePrefixes = "{ "gke-metrics-agent": [ "address already in use" ]}"
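
For illustration only, here is a minimal Go sketch of how a setting like this could be evaluated. It is not the code from this PR. It assumes the value is a JSON object that maps a pod name prefix to a list of error substrings, and the function names `parseIgnoredErrors` and `shouldIgnoreRestart` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseIgnoredErrors decodes the setting value into a map of
// pod name prefix -> error substrings to ignore.
// Hypothetical helper: the JSON shape is an assumption based on the example above.
func parseIgnoredErrors(raw string) (map[string][]string, error) {
	rules := map[string][]string{}
	if strings.TrimSpace(raw) == "" {
		return rules, nil // nothing configured, nothing ignored
	}
	if err := json.Unmarshal([]byte(raw), &rules); err != nil {
		return nil, fmt.Errorf("invalid ignoredErrorsForPodNamePrefixes: %w", err)
	}
	return rules, nil
}

// shouldIgnoreRestart reports whether a restart should be skipped because the
// pod name starts with a configured prefix AND the error message contains one
// of the substrings listed for that prefix.
func shouldIgnoreRestart(rules map[string][]string, podName, message string) bool {
	for prefix, substrings := range rules {
		if !strings.HasPrefix(podName, prefix) {
			continue
		}
		for _, s := range substrings {
			if strings.Contains(message, s) {
				return true
			}
		}
	}
	return false
}

func main() {
	rules, err := parseIgnoredErrors(`{ "gke-metrics-agent": [ "address already in use" ] }`)
	if err != nil {
		panic(err)
	}
	msg := "listen tcp 127.0.0.1:8203: bind: address already in use"
	fmt.Println(shouldIgnoreRestart(rules, "gke-metrics-agent-x7k2p", msg)) // matching prefix and error
	fmt.Println(shouldIgnoreRestart(rules, "my-app-5d9f", msg))             // different pod, not ignored
}
```

In this sketch both conditions must hold, so the same error is still reported for pods outside the listed prefixes, and other errors from gke-metrics-agent pods are still reported as well.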

ventsislav-georgiev commented 7 months ago

@able8 any chance of getting this in?

able8 commented 7 months ago

Hi @nikup @ventsislav-georgiev, thank you for contributing. I left 2 comments. Also, the Contributor License Agreement (CLA) is required before the PR can be merged. Refer to https://github.com/airwallex/k8s-pod-restart-info-collector/blob/master/CONTRIBUTION.md