zalando / zmon

Real-time monitoring of critical metrics & KPIs via elegant dashboards, Grafana3 visualizations & more
https://demo.zmon.io/

Question: unexpected behaviour when alert code changes result of check? #47

Closed rwitzel closed 5 years ago

rwitzel commented 5 years ago

Given that the alert code modifies the result of the check (admittedly somewhat unusual), when I change the alert code so that the result is no longer modified, and then clean up and evaluate the alert, the alert still shows the modified result in the UI for a while (probably until the check next runs on its regular schedule).

Why is it like this? It is a bit confusing.

mohabusama commented 5 years ago

Changes to alert/check conditions do not propagate immediately to the workers/appliance, so a worker may keep executing the old alert condition until the changes arrive. Propagation happens via the remote Scheduler in the appliance, which pulls up-to-date data from the ZMON backend.

Jan-M commented 5 years ago

Yes, exactly. As Mohab points out, the scheduler caches alerts, checks, and entities for 60 seconds by default and refreshes each of them individually. So it may take a few minutes until checks and alerts are updated everywhere.
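
To illustrate the effect, here is a minimal sketch of a per-key cache with a 60-second time-to-live. It is not ZMON's actual scheduler code: the `TTLCache` class, the `backend` dict, the alert id, and the alert expressions are all hypothetical, and the clock is faked so the example runs instantly. It only shows why an edited alert condition can still be served in its old form until the cached entry expires.

```python
from typing import Callable, Dict, Tuple


class TTLCache:
    """Hypothetical cache: reloads a key only after its TTL has expired."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float]) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (time the value was loaded, cached value)
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        # Reload if the key was never loaded or its entry is older than the TTL.
        if entry is None or now - entry[0] >= self._ttl:
            entry = (now, self._loader(key))
            self._entries[key] = entry
        return entry[1]


# Stand-in for the backend that stores alert definitions (assumption).
backend = {"alert-1": "value * 2 > 10"}

# Fake clock so the example does not need to sleep.
fake_now = [0.0]

cache = TTLCache(loader=lambda key: backend[key],
                 ttl_seconds=60,
                 clock=lambda: fake_now[0])

first = cache.get("alert-1")          # loaded from the backend at t=0

backend["alert-1"] = "value > 10"     # alert condition edited at the source

fake_now[0] = 30.0
stale = cache.get("alert-1")          # t=30: cached entry is still served

fake_now[0] = 61.0
fresh = cache.get("alert-1")          # t=61: TTL expired, entry is reloaded

print(first, "|", stale, "|", fresh)
```

In this sketch the lookup at 30 seconds still returns the old condition, and only the lookup after the TTL has passed picks up the edit. With several such caches refreshed independently (alerts, checks, entities), the delays can add up, which would be consistent with the "few minutes" mentioned above.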

rwitzel commented 5 years ago

Thanks!