rancher / fleet

Deploy workloads from Git to large fleets of Kubernetes clusters
https://fleet.rancher.io/
Apache License 2.0

Drop, mark, or wrap transient errors for Fleet pods #632

Open nickgerace opened 2 years ago

nickgerace commented 2 years ago

Users' monitoring and logging services should not raise false alerts on Fleet error logs. We should drop, mark, or wrap the transient errors that Fleet pods log during pod eviction, restarts, leader election, and similar events, so that they no longer trigger those alerts.

Related: https://github.com/rancher/fleet/issues/628
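To make the "mark or wrap" option concrete, here is a rough sketch of one way it could look in Go. Nothing here is taken from the Fleet codebase: `TransientError`, `MarkTransient`, `IsTransient`, `LevelFor`, and `ErrLeaderLost` are hypothetical names, and the choice to log marked errors as `WARN` instead of `ERROR` is an assumption about what alerting rules would key on.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// ErrLeaderLost is a hypothetical sentinel standing in for one kind of
// transient failure; it is not a real Fleet error value.
var ErrLeaderLost = errors.New("leader election lost")

// TransientError marks an error as expected to resolve on its own
// (for example after a restart or retry). Hypothetical type.
type TransientError struct {
	Err error
}

func (e *TransientError) Error() string { return "transient: " + e.Err.Error() }

// Unwrap lets errors.Is and errors.As see the underlying error.
func (e *TransientError) Unwrap() error { return e.Err }

// MarkTransient wraps err as transient. A nil error stays nil.
func MarkTransient(err error) error {
	if err == nil {
		return nil
	}
	return &TransientError{Err: err}
}

// IsTransient reports whether any error in err's chain was marked transient.
func IsTransient(err error) bool {
	var t *TransientError
	return errors.As(err, &t)
}

// LevelFor picks the log level label: marked errors are downgraded to WARN
// so that alert rules matching ERROR lines do not fire on them.
func LevelFor(err error) string {
	if IsTransient(err) {
		return "WARN"
	}
	return "ERROR"
}

func main() {
	// The mark survives further wrapping with %w.
	err := fmt.Errorf("reconcile failed: %w", MarkTransient(ErrLeaderLost))
	log.Printf("level=%s msg=%q", LevelFor(err), err)
}
```

Dropping the error entirely would be the other option from the title; marking keeps the message in the logs while letting alert rules filter on the level.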

nickgerace commented 2 years ago

Not assigning a level of effort (LoE) until we learn more about which transient errors we need to deal with.