elastic / kibana


[Research] bsearch investigation #166206

Open thomasneirynck opened 10 months ago

thomasneirynck commented 10 months ago

The Kibana Server's internal bsearch API is foundational to the functioning of dashboards.

Architecture

[image: rough schematic of the bsearch architecture]
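To make the schematic concrete, here is a simplified, illustrative sketch of the batching idea (the class and endpoint names are hypothetical; the real implementation lives in Kibana's bfetch/data plugins and streams results back per item as they complete). Several per-chart search requests issued within a short window are coalesced into a single round trip to the Kibana server, which fans them out to Elasticsearch:

```ts
// Hypothetical sketch of the client-side batching behind bsearch.
// Several chart searches issued within a short window share one
// round trip to the Kibana server.

interface BatchItem {
  request: unknown;                      // an Elasticsearch search request body
  resolve: (response: unknown) => void;  // settles the caller's promise
  reject: (error: Error) => void;
}

class BatchedSearchClient {
  private buffer: BatchItem[] = [];
  private flushTimer: ReturnType<typeof setTimeout> | undefined;

  // Each chart calls search(); calls made within the same 10 ms window
  // are coalesced into a single bsearch request.
  search(request: unknown): Promise<unknown> {
    return new Promise((resolve, reject) => {
      this.buffer.push({ request, resolve, reject });
      this.flushTimer ??= setTimeout(() => this.flush(), 10);
    });
  }

  private async flush(): Promise<void> {
    const batch = this.buffer;
    this.buffer = [];
    this.flushTimer = undefined;
    try {
      // One HTTP round trip for the whole batch (endpoint name is illustrative).
      const response = await fetch('/internal/bsearch', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ batch: batch.map((item) => item.request) }),
      });
      const { results } = (await response.json()) as { results: unknown[] };
      batch.forEach((item, i) => item.resolve(results[i]));
    } catch (e) {
      batch.forEach((item) => item.reject(e as Error));
    }
  }
}
```

The value is fewer client-server round trips per dashboard; the cost, as noted below, is that the fan-out, re-assembly, and re-encoding work now runs on the Kibana server.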

Purpose

Areas for improvement

bsearch resolves key constraints, but also introduces new ones; primarily, it increases pressure on the Kibana Server.

Goals

Consider:

elasticmachine commented 10 months ago

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

elasticmachine commented 10 months ago

Pinging @elastic/kibana-presentation (Team:Presentation)

Dosant commented 4 months ago

Linking this relevant investigation. Could be helpful

thomasneirynck commented 4 months ago

Query consolidation is an unexplored area and could benefit from the existing architecture.

vadimkibana commented 3 months ago

A few notes, in case they help:

thomasneirynck commented 3 months ago

thx @vadimkibana!

wrt the synthetics use case, @dominiqueclarke just informed me that this usage was removed in 8.10.

thomasneirynck commented 2 months ago

Consider turning bsearch off in Serverless only: https://github.com/elastic/kibana/issues/181938

thomasneirynck commented 1 month ago

With https://github.com/elastic/kibana/pull/179663, we have been collecting more telemetry on the overhead of bsearch, specifically the custom encoding of responses into the line-delimited base64 format (sketched below).
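For context, the work being measured is roughly the following per-result transform. This is a simplified sketch, not the actual bfetch code; the function name and the choice of gzip are mine, and the real code may use a different codec:

```ts
import { gzipSync } from 'zlib';

// Simplified sketch of the per-result re-encoding measured here: each
// search result is serialized, compressed, base64-encoded, and emitted
// as one line of the streamed bsearch response. This serialize/compress/
// encode step runs on the Kibana server for every result in the batch.
function encodeBsearchChunk(result: unknown): string {
  const json = JSON.stringify(result);          // serialize the ES response
  const compressed = gzipSync(json);            // compress the payload
  return compressed.toString('base64') + '\n';  // one base64 line per result
}
```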

Metrics:

Long-tail distribution of time spent per single call.

The 75th percentile sits under 50-60 ms per bsearch call.

[image: distribution of time spent per bsearch call]

Given that a single dashboard typically issues 5-6 bsearch calls to fetch data for all of its charts, we can expect tens to hundreds of milliseconds per dashboard spent just re-encoding data in each time2data cycle (see the rough estimate below).
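A back-of-the-envelope estimate using the 75th-percentile figures above (no new measurements, just arithmetic):

```ts
// Rough per-dashboard re-encoding overhead, assuming 5-6 bsearch calls
// per dashboard at 50-60 ms each (the 75th-percentile figure above).
// Dashboards whose calls sit lower in the distribution land in the tens
// of milliseconds instead.
const lowMs = 5 * 50;   // 250 ms
const highMs = 6 * 60;  // 360 ms
console.log(`~${lowMs}-${highMs} ms spent re-encoding per dashboard refresh`);
```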

Long-tail distribution of total message size

[image: distribution of total message size]

Time spent scales linearly with message size

Encoding time grows linearly with message size, as expected.

Evidence of really large responses.

At the end of the long tail (95th percentile and above), we find evidence of really large responses, on the order of megabytes.

[image: message sizes at the tail of the distribution]

Takeaway

Removing the time spent re-encoding data in bsearch should be a broad but shallow improvement to overall time, and we should expect it to compound positively given the single-threaded nature of Node.js. While we have no metrics on this, the evidence of large data responses suggests that removing the encoding should also reduce memory pressure on the Kibana server at runtime.

Overall, removal will help work towards "thinning" the Kibana server footprint, and should yield measurable improvements to time2data (provided the Kibana server supports HTTP/2 parallelization).

kertal commented 12 hours ago

qq: is the research part done? can this be closed?