Rollup Search Historical + Live: Data Table won't show rolled up data when mixing sources

alexmaurizio commented 5 years ago

Kibana version: 7.3.0

Elasticsearch version: 7.3.0

Server OS version: Elastic Cloud

Browser version: Chrome 75

Browser OS version: Windows 10

Original install method (e.g. download page, yum, from source, etc.): Elastic Cloud

Describe the bug: As the documentation states, the Rollup API can "merge" historical rolled-up data with live data. This could prove very useful for time-series analysis like ours, where we have daily indices for a barrage of client events and then various rollups that summarize the live data, before ILM triggers the deletion of the index (after a backup of the raw data).

When experimenting with Kibana and Data Tables (I'm not sure whether other visualizations have the same problem, but I imagine they do), I noticed that when the selected date range covers both live and historical data, the rolled-up data is simply not loaded, and only the live data is fetched.

I made a drawing (...sorry...) to better explain what happens when using different time ranges and what I would expect to happen.

[Image: ElasticSearchRollupSearch]

To confirm that this was happening, here's my actual cluster status.

Ingested: >400 million docs
Live data: 24 July -> Today
Rolled-up data: 15 May -> Yesterday (the rollup job has a 1d delay on data)

I made 3 different index patterns to run tests against:

A live-only index pattern
A live+rollup index pattern (that should merge the two datasets)
A rollup-only index pattern

I created a simple data table visualization that splits on a field with a cardinality of only 2, and set it to show the event count for the two categories.
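For reference, the data table described above corresponds roughly to a request body like the one built in this Python sketch. It is only an illustration of the aggregation shape: the field names `event_type` and `@timestamp` and the helper `build_data_table_query` are hypothetical, and the exact request Kibana sends may differ.

```python
from typing import Any, Dict


def build_data_table_query(field: str, time_field: str, gte: str, lte: str) -> Dict[str, Any]:
    """Build a search body: a terms split on `field`, restricted to a date range.

    Hypothetical helper for illustration; not taken from Kibana's source.
    """
    return {
        # Rollup search returns aggregations only, so size is 0.
        "size": 0,
        # Restrict the search to the selected time range.
        "query": {"range": {time_field: {"gte": gte, "lte": lte}}},
        # One bucket per category; each bucket's doc_count is the event count.
        "aggs": {"by_category": {"terms": {"field": field, "size": 2}}},
    }


# Example: the full month of July, as in the tests below.
body = build_data_table_query("event_type", "@timestamp", "2019-07-01", "2019-07-31")
```

The same body would be used against all three index patterns; only the target indices change.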

I then grouped the 3 test views in a test dashboard to simplify the screenshots (this also happens in standalone visualizations).

Analyzing July (a full month of data), just like in the drawing I made:

If I search a date range where ONLY rolled-up data exists (live data is deleted), I get correct results (data only on the rollup-only and mixed views):

[Screenshot: RollupBug-SearchOnRollupOnly]

If I search a date range where BOTH rolled-up and live data exist fully, I get correct results (the same results on all three visualizations, except for a few stray events that I don't care about):

[Screenshot: RollupBug-SearchOnLiveAndRoll]

If I search a date range where rolled-up data does not exist yet (August, sadly), I get correct results. The mixed view shows data from the live indices; the numbers are slightly off because data is being ingested live as we speak, so the two queries return slightly different counts, and that's perfectly normal:

[Screenshot: RollupBug-SearchLiveOnly]

If I search a date range that spans both the rolled-up-only period and the live period (in this case, the whole month), only the live data is shown, without "merging" in the rolled-up data. Basically, the rolled-up data is not picked up by the search:

[Screenshot: RollupBug-SearchAcrossBoth]

I am not sure this is the intended result. From what I understood from the docs, the mixed view should show the same results as the "rollup-only" view, while preferring live data where it exists for better precision. But instead of preferring live data only for the range where live data exists, it fetches the live data alone and ignores the whole period where only rolled-up data exists.
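To make that expectation concrete, here is a minimal Python sketch of the merge I would expect, assuming event counts are bucketed per day. The function name, dates, and counts are made up for illustration, and this is not Elasticsearch's actual implementation: where a day exists in both sources the live bucket wins, and days that exist only in the rollup are kept.

```python
from typing import Dict


def merge_buckets(rollup: Dict[str, int], live: Dict[str, int]) -> Dict[str, int]:
    """Merge per-day event counts, preferring live data where both sources overlap."""
    merged = dict(rollup)   # start with every rolled-up day
    merged.update(live)     # live buckets overwrite overlapping days and add new ones
    return dict(sorted(merged.items()))  # ISO dates sort chronologically


# Illustrative data only: rollup covers 22-24 July, live covers 24-25 July.
rollup_counts = {"2019-07-22": 100, "2019-07-23": 120, "2019-07-24": 90}
live_counts = {"2019-07-24": 95, "2019-07-25": 110}

expected = merge_buckets(rollup_counts, live_counts)
expected_total = sum(expected.values())   # all four days are counted
observed_total = sum(live_counts.values())  # what the mixed view shows today: live days only
```

With this sample data the expected total covers all four days, while the behavior I observe corresponds to `observed_total`, which drops 22 and 23 July entirely.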

Steps to reproduce:

Follow what I wrote in the bug description. It is a bit lengthy.

Expected behavior: The mixed index pattern (which matches 1 rollup index + all the live indices) should correctly report data from both the rolled-up and the live data. Instead, it only fetches data from the live indices.

Screenshots (if relevant): Included in the issue body

Thanks, Alessandro

EDIT REASON: added a test case for a search on live-only data (no rollup started yet); it shows correct results

alexmaurizio commented 5 years ago

Update: I just updated the deployment to Elasticsearch 7.3.1 and Kibana 7.3.1 - the bug still exists.

flash1293 commented 3 years ago

@alexmaurizio Sorry for the late response. This looks like an issue on the Elasticsearch side. As the report is a little dated, could you check whether this is still happening on your system?

alexmaurizio commented 3 years ago

@flash1293 We have not upgraded from 7.3.1 yet, and the bug still exists in that version.

timroes commented 3 years ago

@polyfractal We're not doing any specific logic regarding index patterns when querying for rollup (v1) fields. This looks to me like an issue that exists in Elasticsearch. Are you aware of behavior like this?

elastic / kibana

Rollup Search Historical + Live: Data Table won't show rolled up data when mixing sources #42690