Open pmuellr opened 1 year ago
Pinging @elastic/response-ops (Team:ResponseOps)
I just bumped into this myself in my 8.13.2 instance. The following was the Kibana UI alert message:
metrics.alert.threshold:6d774450-ad5d-11ed-838e-91a7b3d66137: execution failed - Rule failed to execute because rule ran after it was disabled.
Is there even a temporary / manual way to recover from this situation? Does one need to manually delete the task or something?
I'm having this issue after upgrading to 8.13.4. Deleting the rule and reimporting the saved object doesn't fix it.
Workaround I found: delete the rule, then wait (however long it takes) for the task to run again and fail because the rule no longer exists. Then reimport the saved object. Now you can enable it.
stack version: 8.5.3
Describe the bug:
Somehow, a rule got marked as disabled - and had its API key deleted - but the task document still existed. When the rule ran, it produced the message
Looking at the task document, it has `enabled: true`, and the rule has `enabled: false`.
Seems we probably added this diagnostic for a case where a rule execution had started, but then the rule was disabled at some point during the execution, so we cancelled the execution.
Feels like we should have a different check, near the start of the execution, where we check to see if the rule is disabled, and if it is, set the `task.enabled` field to `false` in the task document - the rule should "win" if the task and rule differ in enablement, since the user can change the rule, but not the task.
It could still happen that the rule is disabled WHILE it's running, and then the message produced is fine. This feels like it's slightly different, in that we'd check at the very beginning of the run, and for these specific conditions (task enabled, rule disabled).
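A rough sketch of what that pre-run check might look like, to make the proposal concrete. Everything here is hypothetical: `RuleDoc`, `TaskDoc`, `PreRunDecision`, and `checkEnablementBeforeRun` are illustrative names, not existing Kibana or task manager APIs, and the real documents carry many more fields. The function only decides what to do; persisting the task update would be up to the caller.

```typescript
// Hypothetical, simplified document shapes (assumptions for illustration only).
interface RuleDoc {
  id: string;
  enabled: boolean;
}

interface TaskDoc {
  id: string;
  enabled: boolean;
}

// Result of the check made at the very start of a run.
type PreRunDecision =
  | { action: 'run' }
  | { action: 'skip'; taskUpdate?: { enabled: false } };

// The rule "wins" when rule and task disagree on enablement,
// since the user can change the rule but not the task.
function checkEnablementBeforeRun(rule: RuleDoc, task: TaskDoc): PreRunDecision {
  if (rule.enabled) {
    // Nothing to reconcile; let the execution proceed.
    return { action: 'run' };
  }
  if (task.enabled) {
    // Task enabled, rule disabled: skip the run and ask the caller
    // to set task.enabled to false in the task document.
    return { action: 'skip', taskUpdate: { enabled: false } };
  }
  // Both already disabled: skip without touching the task document.
  return { action: 'skip' };
}

// Usage: the situation described in this issue.
const decision = checkEnablementBeforeRun(
  { id: 'rule-1', enabled: false },
  { id: 'task-1', enabled: true }
);
console.log(JSON.stringify(decision));
```

The case where the rule is disabled while an execution is already in flight is not covered by this check and would still be handled by the existing "ran after it was disabled" diagnostic.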