elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Implement client-side updates w/ mget to prune stale docs #155770

Open kobelb opened 1 year ago

kobelb commented 1 year ago

Feature Description

Change the task manager task claiming algorithm to use a _search to retrieve candidate tasks, a _mget to prune the docs whose version number doesn't match, and then a _bulk to claim the tasks. This will increase the background task capacity in Serverless.
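The flow described above can be sketched roughly as follows. This is a minimal illustration, not the task manager's actual code: `TaskDoc`, `TaskStore`, `pruneStale`, and `claimTasks` are hypothetical names, and a single `version` number stands in for whatever concurrency token the real documents carry. It assumes the search results may lag behind the live documents, which is why the candidates are checked against a fresh read before the bulk claim.

```typescript
// Hypothetical shapes for illustration only; not Kibana's task manager types.
interface TaskDoc {
  id: string;
  version: number; // stand-in for the document's concurrency token
}

interface VersionInfo {
  id: string;
  version: number;
}

// Hypothetical store abstraction over the three Elasticsearch calls.
interface TaskStore {
  // Stands in for _search: returns candidate tasks, possibly stale copies.
  search(size: number): Promise<TaskDoc[]>;
  // Stands in for _mget: current version per id, undefined if the doc is gone.
  mget(ids: string[]): Promise<Array<VersionInfo | undefined>>;
  // Stands in for _bulk: claims the given docs, returns the ids actually updated.
  bulkClaim(docs: TaskDoc[], ownerId: string): Promise<string[]>;
}

// Keep only candidates whose version still matches the freshly read version.
function pruneStale(
  candidates: TaskDoc[],
  current: Array<VersionInfo | undefined>
): TaskDoc[] {
  const latest = new Map<string, number>();
  for (const doc of current) {
    if (doc) latest.set(doc.id, doc.version);
  }
  return candidates.filter((c) => latest.get(c.id) === c.version);
}

// search -> mget -> prune -> bulk, returning the ids this owner claimed.
async function claimTasks(
  store: TaskStore,
  ownerId: string,
  capacity: number
): Promise<string[]> {
  const candidates = await store.search(capacity);
  if (candidates.length === 0) return [];

  const current = await store.mget(candidates.map((c) => c.id));
  const fresh = pruneStale(candidates, current);
  if (fresh.length === 0) return [];

  return store.bulkClaim(fresh, ownerId);
}
```

Pruning before the bulk request means docs already known to be stale are never sent, so fewer bulk items are expected to fail on a version conflict. The bulk step would still need its own version check, since a doc can change between the `_mget` and the `_bulk`.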

Business Value

Increased background task capacity, reducing the COGS for running alerting rules and actions, and providing a lower MTTD/MTTR.

Definition of Done

Phases

  1. https://github.com/elastic/kibana/pull/171677
  2. https://github.com/elastic/kibana/issues/181325
  3. https://github.com/elastic/kibana/issues/181326
  4. https://github.com/elastic/kibana/issues/181327

Implementation: multiple PRs:

elasticmachine commented 1 year ago

Pinging @elastic/response-ops (Team:ResponseOps)

kobelb commented 1 year ago

/cc @pmuellr

pmuellr commented 1 year ago

@kobelb looks like a POC of this is here: https://github.com/elastic/kibana/compare/main...kobelb:kibana:task_clientside_update

Is that right?

kobelb commented 1 year ago

@pmuellr https://github.com/elastic/kibana/compare/main...kobelb:kibana:task_clientside_update was a super early attempt at this that I wouldn't recommend treating as a proof-of-concept for this implementation.

https://github.com/elastic/kibana/pull/150769, minus the parts that do "task partitioning", is closer to what we want here. Happy to discuss further.

mikecote commented 1 year ago

@pmuellr here's a PR (https://github.com/elastic/kibana/pull/157156) with what I did for ON week. It contains some client-side update code that you can pick from as well (minus the cost parts). I removed a little bit of RxJS as well.

If ever we wanted to explore having the search -> update logic in a worker thread, I've POC'ed it here: https://github.com/mikecote/kibana/pull/5.

I've also POC'ed how to skip the claiming phase if ever we wanted to save an update: https://github.com/mikecote/kibana/pull/6. But it may be useful not to do this if we ever decide to use worker threads.

mikecote commented 9 months ago

Here's a rollout plan after discussing with @kobelb:

Following this plan should allow us to implement the new polling mechanism while mitigating risk.

cc @pmuellr