elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Scaling the alerting throughput ceiling from 3,200 to 32,000 rules per minute #188194

Open mikecote opened 1 month ago

mikecote commented 1 month ago

Problem Statement

Usage of Kibana background tasks and alerting rules is continuously growing, and we are approaching our scalability ceiling of 3,200 tasks per minute.

Objective

Increase the overall alerting rule throughput by 10x before Jan ’25

Goals

Scope

Workstreams

Roadmap for "10x at the framework level" workstream

1. PoC

2. Solve the horizontal scalability limits by allowing more Kibana nodes to run tasks

3. Solve the vertical scalability limits by running more tasks per Kibana node

4. Work items remaining before rolling out to Serverless

5. Initial rollout to Serverless

6. Work items remaining before 8.16 feature freeze

7. Optional follow-ups and optimizations after 8.16

8. Blog post
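
To make the relationship between roadmap steps 2 and 3 concrete, here is a minimal back-of-the-envelope sketch. It assumes throughput scales linearly with both node count and per-node task rate, which real clusters only approximate. The `ClusterShape` type, the function names, and the node counts and per-node rates are all hypothetical and chosen only so the totals match the 3,200 and 32,000 figures from this issue; they are not Kibana or Task Manager settings.

```typescript
// Hypothetical model: none of these names or numbers come from Kibana itself.
interface ClusterShape {
  nodes: number;                 // Kibana nodes running background tasks (horizontal axis)
  tasksPerNodePerMinute: number; // tasks one node completes per minute (vertical axis)
}

// Cluster-wide throughput, assuming perfectly linear scaling on both axes.
function throughputPerMinute(shape: ClusterShape): number {
  return shape.nodes * shape.tasksPerNodePerMinute;
}

// Number of nodes required to reach a target throughput at a given per-node rate.
function nodesNeeded(targetPerMinute: number, tasksPerNodePerMinute: number): number {
  return Math.ceil(targetPerMinute / tasksPerNodePerMinute);
}

// Illustrative shapes only: 16 x 200 = 3,200 and 80 x 400 = 32,000.
const baseline: ClusterShape = { nodes: 16, tasksPerNodePerMinute: 200 };
const scaled: ClusterShape = { nodes: 80, tasksPerNodePerMinute: 400 };

console.log(throughputPerMinute(baseline)); // 3200
console.log(throughputPerMinute(scaled));   // 32000
console.log(nodesNeeded(32000, 400));       // 80
```

Under this simplified model, a 10x increase can come from any combination of the two axes whose factors multiply to 10, for example 5x more nodes and 2x more tasks per node.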

elasticmachine commented 1 month ago

Pinging @elastic/response-ops (Team:ResponseOps)