Closed shahzad31 closed 2 years ago
Pinging @elastic/uptime (Team:uptime)
Thanks for finding this. Interestingly, the issue here is similar to the one I reported about the last successful screenshot. It seems we should be filtering for heartbeat/summary docs in more places.
Kibana version: master
If you have a monitor with multiple steps, and one step fails while another succeeds, the alert configured for that monitor flip-flops between the recovered and down states.
This is especially true if one of the steps times out. For example, if the alert interval is 1 minute and the failed step takes 60 seconds to execute, the alert will flip-flop, since it only evaluates the status of a single step.
The main issue is that the query uses the `monitor.status` field to check the down status of the monitor. This part of the code is the main problem:
https://github.com/elastic/kibana/blob/main/x-pack/plugins/uptime/server/lib/requests/get_monitor_status.ts#L61
There are two ways we can fix this: either add a filter for the summary document, or update the existing filter to look for the summary.down count.
Both solutions should be fine.
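As a rough sketch of the two options above, the filters might look something like the following. This is not the actual code in get_monitor_status.ts: the `EsFilter` type, the constant names, and `buildDownFilters` are hypothetical, and the existing check is assumed to be a `term` filter on `monitor.status`. The `exists`, `range`, and `term` clauses themselves are standard Elasticsearch query DSL.

```typescript
// Hypothetical, simplified filter shape; not the real Kibana/Elasticsearch types.
type EsFilter = Record<string, unknown>;

// Assumed form of the existing check: match any document whose status is "down".
// For a multi-step monitor this can match individual step documents.
const statusDownFilter: EsFilter = { term: { 'monitor.status': 'down' } };

// Option 1: keep the status check, but only consider summary documents.
const summaryExistsFilter: EsFilter = { exists: { field: 'summary' } };

// Option 2: replace the status check with a check on the summary's down count.
const summaryDownFilter: EsFilter = { range: { 'summary.down': { gt: 0 } } };

// Returns the filter clauses for the chosen fix (hypothetical helper).
function buildDownFilters(option: 'summary-exists' | 'summary-down'): EsFilter[] {
  return option === 'summary-exists'
    ? [statusDownFilter, summaryExistsFilter]
    : [summaryDownFilter];
}
```

Either result would go into the `bool.filter` array of the status query. Both variants restrict matching to the summary document for the check run, so the outcome no longer depends on which individual step document is seen.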
This is the config I used to reproduce this: