Open mikecote opened 2 months ago
Pinging @elastic/response-ops (Team:ResponseOps)
I'm aware of this behavior, but I'm happy to revisit it, as the decisions we made might have been wrong. We decided to handle it this way because on Serverless we retry these actions 10 times. As a result, a single incident in which a connector action failed was increasing the failure count 10x, throwing off our SLOs. Counting it as a failure only on the first attempt made the success-to-failure ratio more accurate.
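To make the trade-off concrete, here is a minimal sketch of the two counting strategies. It is not the actual Kibana implementation: `countFailures` and its parameters are hypothetical names used only for illustration, and the 10 attempts mirror the Serverless retry count mentioned above.

```typescript
// Hypothetical illustration only; these names are not part of Kibana's code.
// Each entry in `attemptSucceeded` is the outcome of one attempt of the same action.
function countFailures(attemptSucceeded: boolean[], firstAttemptOnly: boolean): number {
  let failures = 0;
  attemptSucceeded.forEach((succeeded, index) => {
    if (succeeded) return;
    // When enabled, failed retries (index > 0) are not counted again.
    if (firstAttemptOnly && index > 0) return;
    failures += 1;
  });
  return failures;
}

// One incident: the same action fails on all 10 attempts.
const attempts: boolean[] = Array.from({ length: 10 }, () => false);

const countingEveryAttempt = countFailures(attempts, false);
const countingFirstAttemptOnly = countFailures(attempts, true);

console.log(countingEveryAttempt, countingFirstAttemptOnly);
```

Counting every attempt reports 10 failures for what is really one incident, while counting only the first attempt reports 1. The cost of the second approach is that the metric no longer moves on retries.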
While opening https://github.com/elastic/kibana/issues/180419 (you can use the same steps to reproduce), I noticed that the metrics were not incrementing when the action was attempted a second and third time.
I'm not sure whether this is by design or a bug in our system, so I opened this issue to discuss.