influxdata / flux

Flux is a lightweight scripting language for querying databases (like InfluxDB) and working with data. It's part of InfluxDB 1.7 and 2.0, but can be run independently of those.
https://influxdata.com
MIT License

`window` operations are not pushed down if `option location` is set #5422

Closed matteo-zanoni closed 1 year ago

matteo-zanoni commented 1 year ago

The docs state that window should be a pushdown function. However, window does not seem to be pushed down when option location is set, which is needed to get proper window alignment in a specific timezone.
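For context, here is a minimal Python sketch of what "proper window alignment in a specific timezone" means for 1d windows. It is not Flux and says nothing about why the pushdown is skipped. It only shows that calendar days in Europe/Rome are not always 24 hours long, so location-aware daily windows cannot be expressed as fixed 24h intervals in UTC. The helper name `local_day_length` and the sample dates are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9; needs system tz data

ROME = ZoneInfo("Europe/Rome")


def local_day_length(year, month, day, tz=ROME):
    """Return the real elapsed time between local midnight and the next local midnight."""
    start = datetime(year, month, day, tzinfo=tz)
    # Adding a timedelta to an aware datetime moves the wall clock,
    # so this is the next local midnight even across a DST change.
    end = start + timedelta(days=1)
    # Subtract in UTC to get elapsed time rather than wall-clock difference.
    return end.astimezone(timezone.utc) - start.astimezone(timezone.utc)


if __name__ == "__main__":
    # Hypothetical sample dates: the two DST transitions of 2023 and an ordinary day.
    for y, m, d in [(2023, 3, 26), (2023, 6, 15), (2023, 10, 29)]:
        print(f"{y}-{m:02d}-{d:02d}: {local_day_length(y, m, d)}")
```

With a location set, a 1d window follows these local midnights. Without one, windows fall on UTC day boundaries.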

For testing I used a range of 30 days on 4 series with points every 15 minutes.

The following query (with no location):

import "profiler"

option profiler.enabledProfilers = ["query", "operator"]

from(bucket: "Cliente")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "data")
  // other filters omitted for brevity
  |> aggregateWindow(every: 1d, fn: first)
  |> yield()

This gives a maxAllocated of approximately 800 bytes, and the operator profiler shows that the whole query has been pushed down (there is only one operation):

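| result | table | _measurement | Type | Label | Count | MinDuration | MaxDuration | DurationSum | MeanDuration |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 1 | profiler/operator | *influxdb.readWindowAggregateSource | ReadWindowAggregateByTime19 | 1 | 2269362 | 2269362 | 2269362 | 2269362 |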

If I set the location:

import "profiler"
import "timezone"

option profiler.enabledProfilers = ["query", "operator"]
option location = timezone.location(name: "Europe/Rome")

from(bucket: "Cliente")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "data")
  // other filters omitted for brevity
  |> aggregateWindow(every: 1d, fn: first)
  |> yield()

This gives a maxAllocated of approximately 2,700,000 bytes, and the operator profiler shows that only the filters have been pushed down:

| result | table | _measurement | Type | Label | Count | MinDuration | MaxDuration | DurationSum | MeanDuration |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 1 | profiler/operator | *influxdb.readFilterSource | merged_ReadRange17_filter2_filter3_filter4_filter5_filter6_filter7_filter8_filter9_filter10 | 1 | 47373695 | 47373695 | 47373695 | 47373695 |
| | 1 | profiler/operator | *universe.fixedWindowTransformation | window11 | 6 | 2798 | 12463136 | 49746537 | 8291089.5 |
| | 1 | profiler/operator | *execute.indexSelectorTransformation | first12 | 126 | 1654 | 1633800 | 2868003 | 22761.928571428572 |
| | 1 | profiler/operator | *table.fillTransformation | experimental/table.fill13 | 126 | 137 | 26452 | 199982 | 1587.1587301587301 |
| | 1 | profiler/operator | *universe.schemaMutationTransformation | duplicate14 | 126 | 809 | 33083 | 434832 | 3451.0476190476193 |
| | 1 | profiler/operator | *universe.fixedWindowTransformation | window15 | 126 | 910 | 98258 | 1029264 | 8168.761904761905 |

I expected window to be pushed down regardless of the location. Without the pushdown, memory usage is much higher, which creates a lot of problems.
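To put the two runs side by side, here is a rough Python sketch that compares the figures quoted in this issue. The numbers are copied from the profiler outputs and the approximate maxAllocated values. The short dictionary keys are abbreviations chosen for illustration. Summing DurationSum across operators is only a coarse indicator, since operators may overlap in time.

```python
# DurationSum per operator, copied from the profiler output without `option location`.
pushed_down = {
    "ReadWindowAggregateByTime19": 2269362,
}

# DurationSum per operator, copied from the profiler output with `option location` set.
# Keys are shortened labels used only in this sketch.
not_pushed_down = {
    "readFilterSource": 47373695,
    "window11": 49746537,
    "first12": 2868003,
    "table.fill13": 199982,
    "duplicate14": 434832,
    "window15": 1029264,
}

# Approximate maxAllocated values reported in the issue, in bytes.
MAX_ALLOCATED_PUSHED_DOWN = 800
MAX_ALLOCATED_NOT_PUSHED_DOWN = 2_700_000

total_pushed = sum(pushed_down.values())
total_not_pushed = sum(not_pushed_down.values())

duration_ratio = total_not_pushed / total_pushed
memory_ratio = MAX_ALLOCATED_NOT_PUSHED_DOWN / MAX_ALLOCATED_PUSHED_DOWN

if __name__ == "__main__":
    print(f"summed DurationSum, pushed down:     {total_pushed}")
    print(f"summed DurationSum, not pushed down: {total_not_pushed}")
    print(f"duration ratio: {duration_ratio:.1f}x")
    print(f"memory ratio:   {memory_ratio:.0f}x")
```

On these figures, the run with the location set allocates several thousand times more memory and spends tens of times longer in operators than the fully pushed-down run.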

github-actions[bot] commented 1 year ago

This issue has had no recent activity and will be closed soon.