elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Lens] Optimize redundant formula function aggregations #135265

Open drewdaemon opened 2 years ago

drewdaemon commented 2 years ago

Describe the feature: When a Lens formula calls the same function more than once, each call is currently translated into its own identical aggregation in the request to Elasticsearch. Elasticsearch does not optimize this; it performs the exact same work once per aggregation.

https://github.com/elastic/kibana/pull/131875 added the flexibility to use a single Elasticsearch aggregation to power multiple Lens dimensions. It also introduced an expression optimization hook on the Operation class. We should be able to use this groundwork to merge all redundant formula function calls into a single aggregation request to Elasticsearch. This will improve performance and lessen cluster load.
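To make the idea concrete, here is a minimal sketch of merging redundant calls, not Kibana's actual implementation. The `AggColumn` shape, `aggKey`, and `dedupeAggs` are hypothetical names invented for this illustration, and the sketch assumes two calls are redundant when their function, field, and filter all match.

```typescript
// Hypothetical column shape for illustration; not Kibana's real types.
interface AggColumn {
  id: string;      // column id referenced by the formula's math expression
  fn: string;      // e.g. "sum", "count", "average"
  field?: string;  // source field, if the function takes one
  filter?: string; // optional filter applied to this call
}

interface DedupeResult {
  aggs: AggColumn[];             // unique aggregations to send to Elasticsearch
  idMap: Record<string, string>; // original column id -> surviving column id
}

// Two calls are treated as redundant when this key is identical (an assumption).
function aggKey(col: AggColumn): string {
  return JSON.stringify([col.fn, col.field ?? null, col.filter ?? null]);
}

function dedupeAggs(columns: AggColumn[]): DedupeResult {
  const firstByKey = new Map<string, AggColumn>();
  const idMap: Record<string, string> = {};
  for (const col of columns) {
    const key = aggKey(col);
    const existing = firstByKey.get(key);
    if (existing) {
      // Redundant call: point it at the aggregation we already have.
      idMap[col.id] = existing.id;
    } else {
      firstByKey.set(key, col);
      idMap[col.id] = col.id;
    }
  }
  return { aggs: Array.from(firstByKey.values()), idMap };
}
```

With a formula that calls `sum(bytes)` twice, only one `sum` aggregation would remain in `aggs`, and `idMap` would let both dimensions read from the same result.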

Describe a specific use case for the feature:

Logistic function

[Screenshot: Screen Shot 2022-06-27 at 4 27 00 PM]

elasticmachine commented 2 years ago

Pinging @elastic/kibana-vis-editors @elastic/kibana-vis-editors-external (Team:VisEditors)

drewdaemon commented 2 years ago

@flash1293 just to verify: is this the comprehensive list of operations that need to be optimized?

[Screenshot: Screen Shot 2022-09-14 at 2 40 08 PM]

flash1293 commented 2 years ago

Yes, that’s right. Technically an unfiltered count is “for free”, but I don’t see how this would simplify things. Also, we already have some special optimization for percentiles we should keep in mind.

drewdaemon commented 1 year ago

I believe the only formula function left to optimize is percentile_rank. To optimize it, we need to follow a similar pattern to what is currently done for percentile.
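As a rough sketch of what that could look like, assuming the pattern means collapsing several `percentile_rank` calls on the same field into one Elasticsearch `percentile_ranks` aggregation with multiple `values`: the `PercentileRankCall` and `PercentileRanksGroup` types and the `groupPercentileRanks` helper are hypothetical, and filters and other per-call options are ignored for brevity.

```typescript
// Hypothetical shapes for illustration; not Kibana's real types.
interface PercentileRankCall {
  id: string;    // column id used by the formula
  field: string; // field the rank is computed on
  value: number; // value whose percentile rank is requested
}

interface PercentileRanksGroup {
  // Request body for one Elasticsearch percentile_ranks aggregation.
  body: { percentile_ranks: { field: string; values: number[] } };
  // Which formula columns read each requested value.
  columnIdsByValue: Record<string, string[]>;
}

function groupPercentileRanks(calls: PercentileRankCall[]): PercentileRanksGroup[] {
  const byField = new Map<string, PercentileRanksGroup>();
  for (const call of calls) {
    let group = byField.get(call.field);
    if (!group) {
      group = {
        body: { percentile_ranks: { field: call.field, values: [] } },
        columnIdsByValue: {},
      };
      byField.set(call.field, group);
    }
    const key = String(call.value);
    let ids = group.columnIdsByValue[key];
    if (!ids) {
      // First time this value is requested for this field.
      ids = [];
      group.columnIdsByValue[key] = ids;
      group.body.percentile_ranks.values.push(call.value);
    }
    ids.push(call.id);
  }
  return Array.from(byField.values());
}
```

Each group yields one aggregation per field, and `columnIdsByValue` records which formula columns should read each returned rank.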