elastic / elastic-agent

Elastic Agent - single, unified way to add monitoring for logs, metrics, and other types of data to a host.

[8.x](backport #5999) Add failureThreshold to elastic-agent self-monitoring config #6090

Closed mergify[bot] closed 5 days ago

mergify[bot] commented 5 days ago

What does this PR do?

Use the failure_threshold setting introduced in https://github.com/elastic/beats/pull/41570 in the self-monitoring configuration, so that elastic-agent does not report DEGRADED when a metrics fetch fails because a component is starting or stopping. The failure threshold defaults to 2 and can be overridden in the config file or the Fleet policy.

Why is it important?

A single failed metrics fetch should not cause the agent status to be misreported. See https://github.com/elastic/elastic-agent/issues/5332


This is an automatic backport of pull request #5999 done by Mergify.

elasticmachine commented 5 days ago

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

elastic-sonarqube[bot] commented 5 days ago

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 40%)

See analysis details on SonarQube