Open mikecote opened 4 months ago
Pinging @elastic/response-ops (Team:ResponseOps)
Problem Statement
Usage of Kibana background tasks and alerting rules is growing continuously, and we are approaching our scalability ceiling of 3,200 tasks per minute.
Objective
Increase the overall alerting rule throughput by 10x before Jan '25.
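To put rough numbers on the objective, here is a minimal back-of-the-envelope sketch. Only the 3,200 tasks per minute ceiling and the 10x factor come from this issue; the per-node rates (400 and 800 tasks per minute) and the function names are hypothetical, chosen purely to illustrate how horizontal scaling (more nodes) and vertical scaling (more tasks per node) combine.

```typescript
// Capacity arithmetic for the 10x objective.
// Only CURRENT_CEILING_PER_MINUTE and TARGET_MULTIPLIER come from the issue;
// every other number below is a hypothetical illustration.

const CURRENT_CEILING_PER_MINUTE = 3200; // stated in the problem statement
const TARGET_MULTIPLIER = 10; // stated in the objective

/** Target cluster-wide throughput in tasks per minute. */
function targetThroughput(ceiling: number, multiplier: number): number {
  return ceiling * multiplier;
}

/** Cluster throughput when every node sustains the same per-node rate. */
function clusterThroughput(nodes: number, tasksPerMinutePerNode: number): number {
  return nodes * tasksPerMinutePerNode;
}

/** Smallest node count whose combined throughput reaches the target. */
function nodesNeeded(target: number, tasksPerMinutePerNode: number): number {
  if (tasksPerMinutePerNode <= 0) {
    throw new RangeError("tasksPerMinutePerNode must be positive");
  }
  return Math.ceil(target / tasksPerMinutePerNode);
}

const target = targetThroughput(CURRENT_CEILING_PER_MINUTE, TARGET_MULTIPLIER);
console.log(target); // 32000

// Hypothetical per-node rates: doubling what one node can run (vertical)
// halves the number of nodes needed to reach the same target (horizontal).
console.log(nodesNeeded(target, 400)); // 80
console.log(nodesNeeded(target, 800)); // 40
```

The point of the sketch is only that the two levers multiply: any gain in per-node throughput reduces the node count required for the same target, which is why the roadmap treats them as separate steps.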
Goals
Scope
Workstreams
Roadmap for "10x at the framework level" workstream
1. PoC
2. Solve the horizontal scalability limits by allowing more Kibana nodes to run tasks
3. Solve the vertical scalability limits by running more tasks per Kibana node
`500ms`
kibana#190059
4. Work items remaining before rolling out to Serverless
… `logger.warn` to thrown errors (or something alike) in mget claims strategy so serverless metrics pick them up kibana#190082
5. Initial rollout to Serverless
6. Work items remaining before 8.16 feature freeze
… `idle` state to `running` (skip `claiming`) kibana#184739
`xpack.task_manager.capacity` kibana#192185
7. Optional follow-ups and optimizations after 8.16
8. Blogpost