claranet / terraform-signalfx-detectors

Collection of terraform modules for SignalFx detectors.
Mozilla Public License 2.0
23 stars 32 forks source link

[BUG] pod_phase_status alerts does not clear when the pod was replaced by another #566

Open adrienguyclaranet opened 4 months ago

adrienguyclaranet commented 4 months ago

What is the module? otel-collector_kubernetes-common

What is the detector? pod_phase_status

Describe the bug pod_phase_status alerts are still active even when the pod does not exist anymore

To Reproduce Steps to reproduce the behavior:

  1. Nominal "ok" state of the detector
  2. The pod is in a failed state the alert is raised
  3. The pod is automaticaly recreated by k8s
  4. The alert is still active unless we do a manual clear

Expected behavior The alert should clear itself when the pod does not exist anymore or if the pod just pop-up and dies quickly this detector should not triggers at all

Screenshots

Additional context A local solution has been found : Add .fill(2,duration='1s') in the line : signal = data('k8s.pod.phase', filter=base_filtering and filter('env', 'prod') and filter('sfx_monitored', 'true')).fill(2,duration='1s').publish('signal')

Pull request should come up soon.