elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[maps] discuss deprecating "Top term" metric for cluster and grid layers #124397

Closed nreese closed 1 year ago

nreese commented 2 years ago

Should "Top term" metric be deprecated for cluster and grid layers?

History

"Top term" metric was added to support blended layer styling. We needed a metric for string fields so that document layers styled by a string field could have similar styling when showing clusters.

There are no string metrics in Elasticsearch. To implement the feature, we request a terms bucket aggregation and take the top value, along with the percentage of the data set that this top value represents.
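A minimal sketch of that derivation, for illustration only. The names `topTermAgg`, `extractTopTerm`, `TermsBucket`, and `TopTerm` are hypothetical and are not taken from the Kibana code base; the only assumption about Elasticsearch is that a terms aggregation returns buckets with `key` and `doc_count`.

```typescript
// Shape of one bucket in a terms aggregation response.
interface TermsBucket {
  key: string;
  doc_count: number;
}

// Hypothetical result type for the made-up "Top term" metric.
interface TopTerm {
  term: string;
  percentage: number; // share of the cell's documents, 0 to 100
}

// Hypothetical helper: aggregation fragment that asks for only the most common term.
function topTermAgg(field: string): { terms: { field: string; size: number } } {
  return { terms: { field, size: 1 } };
}

// Hypothetical helper: derive "Top term" for one cluster or grid cell.
// totalDocCount is the number of documents in that cell.
function extractTopTerm(buckets: TermsBucket[], totalDocCount: number): TopTerm | null {
  if (buckets.length === 0 || totalDocCount <= 0) {
    return null;
  }
  // Pick the largest bucket explicitly instead of relying on response order.
  const top = buckets.reduce((best, b) => (b.doc_count > best.doc_count ? b : best));
  return {
    term: top.key,
    percentage: Math.round((top.doc_count / totalDocCount) * 100),
  };
}
```

The point of the sketch is that the "metric" is computed on the Kibana side from a bucket aggregation, which is why it has no direct Elasticsearch equivalent.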

Why "Top term" metric is a problem

The aggs request parameter of the Elasticsearch vector tile API only accepts metric aggregations. Since the "Top term" metric is a made-up concept in Kibana maps, it does not exist in Elasticsearch. As such, there is no "Top term" support in the Elasticsearch vector tile API.

Lack of support for "Top terms" is blocking migration from GeoJSON to vector tiles for cluster/grid layers at lower resolutions. Cluster and grid sources use vector tiles at "SUPER_FINE" resolution and GeoJSON at "COARSE", "FINE", and "MOST_FINE".

New features like hex bins exacerbate the problem.

Solution

There are 2 places to resolve the problem:
1) Implement "Top terms" in Elasticsearch. This is a large amount of work on the Elasticsearch side. Vector tiles do not support any bucket aggregation at the moment.
2) Stop using "Top terms" in Kibana maps in all clustering use cases except with blended layers.

Near-term proposal

1) Hide "Top terms" from metrics UI for hex bins. Implement "hex bins" to use vector tiles at all resolutions. This is similar to the implementation of the heatmap layer, where clustering uses vector tiles at all resolutions.
2) Hide "Top terms" from metrics UI for cluster and grid sources so users cannot create any new "Top terms" metrics.
3) Update cluster and grid sources to use vector tiles at all resolutions when the "Top terms" metric is not present. GeoJSON will only be used when the "Top terms" metric is present.
4) Gather telemetry on how many cluster and grid sources use "Top terms" metrics.
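The selection logic in steps 2 and 3 could be sketched roughly as follows. This is an illustration under stated assumptions, not Kibana's implementation: the `Metric` type, the `'top_term'` identifier, the `'vector_tile'` and `'geojson'` labels, and all function names are hypothetical. Only the resolution names come from the description above.

```typescript
type GridResolution = 'COARSE' | 'FINE' | 'MOST_FINE' | 'SUPER_FINE';
type RenderFormat = 'geojson' | 'vector_tile';
type MetricType = 'count' | 'sum' | 'avg' | 'min' | 'max' | 'top_term';

// Hypothetical metric descriptor; 'top_term' is an illustrative type name.
interface Metric {
  type: MetricType;
  field?: string;
}

const ALL_METRIC_TYPES: MetricType[] = ['count', 'sum', 'avg', 'min', 'max', 'top_term'];

// Behavior as described in the issue: only the finest resolution uses vector tiles.
function currentFormat(resolution: GridResolution): RenderFormat {
  return resolution === 'SUPER_FINE' ? 'vector_tile' : 'geojson';
}

// Step 3 (assumed reading): vector tiles at every resolution unless an
// existing "Top terms" metric forces the GeoJSON path.
function proposedFormat(metrics: Metric[]): RenderFormat {
  const hasTopTerm = metrics.some((m) => m.type === 'top_term');
  return hasTopTerm ? 'geojson' : 'vector_tile';
}

// Step 2: metric types offered in the UI, with "Top terms" hidden so no
// new metrics of that type can be created.
function selectableMetricTypes(): MetricType[] {
  return ALL_METRIC_TYPES.filter((t) => t !== 'top_term');
}
```

Under this reading, existing layers that already carry a "Top terms" metric keep working over GeoJSON, while every other cluster or grid layer moves to vector tiles regardless of resolution.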

elasticmachine commented 2 years ago

Pinging @elastic/kibana-gis (Team:Geo)

thomasneirynck commented 2 years ago

+1 on removal as a metric for clusters/grids. It seems like an edge case that loses some of its importance now that blended layers are no longer the default.

fwiw: "top-terms" would still remain useful as a metric in choropleth layers ("term joins"), and iirc that was the initial reason why top-terms was introduced.

elasticmachine commented 1 year ago

Pinging @elastic/kibana-presentation (Team:Presentation)

nreese commented 1 year ago

In order to provide better transparency of priorities, issues that will not be prioritized within the next 24 months are being closed.

Tracking request in Maps ice box https://github.com/elastic/kibana/issues/154870