influxdata / flux

Flux is a lightweight scripting language for querying databases (like InfluxDB) and working with data. It's part of InfluxDB 1.7 and 2.0, but can be run independently of those.
https://influxdata.com
MIT License

puzzled time with aggregateWindow() function #5485

Open ccbellflower opened 4 months ago

ccbellflower commented 4 months ago

Raw data is inserted every 30s, so when the interval parameter, every, is set in minutes, every interval should have a value, the time of which ... But that's not what happens.

ccbellflower commented 4 months ago

The query I'm running is below:

from(bucket: "bucket")
    |> range(start: 2024-05-29T16:01:00.000Z, stop: 2024-05-29T16:30:00.000Z)
    |> filter(fn: (r) => r["_measurement"] == "measurement")
    |> filter(fn: (r) => r["_field"] == "pi01")
    |> aggregateWindow(
        every: 8m,
        fn: last,
        createEmpty: false,
        timeSrc: "_time",
        // offset: 1m
    )
    |> yield(name: "last")

sanderson commented 4 months ago

@ccbellflower Window boundaries originate at the Unix epoch and are calculated based on that starting point. So window boundaries fall at every multiple of the window duration (7m or 8m in your case), counting from 1970-01-01T00:00:00Z. I'm curious to see what your actual window boundaries are in your query. Run the following to get the start boundary of each window:

from(bucket: "bucket")
    |> range(start: 2024-05-29T16:01:00.000Z, stop: 2024-05-29T16:30:00.000Z)
    |> filter(fn: (r) => r["_measurement"] == "measurement")
    |> filter(fn: (r) => r["_field"] == "pi01")
    |> window(every: 8m, createEmpty: true)
    |> unique(column: "_start")

ccbellflower commented 4 months ago

@sanderson Could you please check the data in this link? I wasn't able to post screenshots in this comment. Judging by the data in the link, I don't think the window boundaries are based on that starting point.

sanderson commented 4 months ago

@ccbellflower Thanks for the link to the community post. When I query for the actual window boundaries that are being used in your query (limited to the time range you defined), I get the following:

7m windows

_start _stop
2024-05-29T16:01:00Z 2024-05-29T16:04:00Z
2024-05-29T16:04:00Z 2024-05-29T16:11:00Z
2024-05-29T16:11:00Z 2024-05-29T16:18:00Z
2024-05-29T16:18:00Z 2024-05-29T16:25:00Z
2024-05-29T16:25:00Z 2024-05-29T16:30:00Z

8m windows

_start _stop
2024-05-29T16:01:00Z 2024-05-29T16:08:00Z
2024-05-29T16:08:00Z 2024-05-29T16:16:00Z
2024-05-29T16:16:00Z 2024-05-29T16:24:00Z
2024-05-29T16:24:00Z 2024-05-29T16:30:00Z
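
Both tables can be reproduced with a bit of arithmetic. The following is a minimal Python sketch, not Flux's actual implementation: it assumes boundaries sit at whole multiples of the window duration counted from the Unix epoch, and that the first and last windows are clipped to the range() bounds. The names epoch_aligned_windows and fmt are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_aligned_windows(start, stop, every):
    """Return (window_start, window_stop) pairs covering [start, stop).

    Assumption: boundaries sit at whole multiples of `every` since the
    Unix epoch, and the outer windows are clipped to the query range.
    """
    # Largest epoch-aligned boundary that is <= start.
    boundary = EPOCH + ((start - EPOCH) // every) * every
    windows = []
    while boundary < stop:
        next_boundary = boundary + every
        windows.append((max(boundary, start), min(next_boundary, stop)))
        boundary = next_boundary
    return windows


def fmt(ts):
    """Format a timestamp as HH:MM for compact printing."""
    return ts.strftime("%H:%M")


# Same range as the query in this thread.
start = datetime(2024, 5, 29, 16, 1, tzinfo=timezone.utc)
stop = datetime(2024, 5, 29, 16, 30, tzinfo=timezone.utc)

for minutes in (7, 8):
    print(f"{minutes}m windows")
    every = timedelta(minutes=minutes)
    for w_start, w_stop in epoch_aligned_windows(start, stop, every):
        print(f"  {fmt(w_start)} -> {fmt(w_stop)}")
```

Because 16:01 is not a multiple of 7m or 8m since the epoch, the first window in each table is shorter than the requested duration.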

The last function you're using in aggregateWindow() is a selector function that doesn't change the timestamp of the row that is returned for each window. The timestamp in your results is the original timestamp associated with the last row in each window. To modify the timestamp for each window, you need to use an aggregate function or define a custom function that replaces the timestamp of the last row with the boundary of the window.
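
To illustrate the difference, here is a rough Python simulation (not Flux code) of picking the last row per window and then choosing which column supplies the output timestamp. The data points are synthetic (one every 30s, starting at 16:01:29), the window boundaries are copied from the 7m table above, and last_per_window and its time_src argument are hypothetical stand-ins for the behavior of fn: last combined with timeSrc.

```python
from datetime import datetime, timedelta, timezone


def at(hour, minute, second=0):
    """Build a UTC timestamp on 2024-05-29."""
    return datetime(2024, 5, 29, hour, minute, second, tzinfo=timezone.utc)


# 7m window boundaries copied from the table above.
windows = [
    (at(16, 1), at(16, 4)),
    (at(16, 4), at(16, 11)),
    (at(16, 11), at(16, 18)),
    (at(16, 18), at(16, 25)),
    (at(16, 25), at(16, 30)),
]

# Synthetic data (an assumption, not the reporter's data): one point every 30s.
points = []
ts = at(16, 1, 29)
while ts < at(16, 30):
    points.append((ts, len(points)))
    ts += timedelta(seconds=30)


def last_per_window(points, windows, time_src):
    """Select the last point in each [start, stop) window.

    time_src picks the output timestamp: "_time" keeps the row's own
    timestamp, "_start" or "_stop" replaces it with a window boundary.
    """
    rows = []
    for w_start, w_stop in windows:
        in_window = [p for p in points if w_start <= p[0] < w_stop]
        if not in_window:
            continue  # skip empty windows, like createEmpty: false
        row_time, value = in_window[-1]
        out_time = {"_time": row_time, "_start": w_start, "_stop": w_stop}[time_src]
        rows.append((out_time, value))
    return rows


for src in ("_time", "_stop"):
    print(src, [t.strftime("%H:%M:%S") for t, _ in last_per_window(points, windows, src)])
```

With "_time", the first output row carries the timestamp of the last point before 16:04, because the first window is clipped to 16:01 through 16:04. With "_stop", every row is stamped with its window's stop boundary instead.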

ccbellflower commented 3 months ago

@sanderson Thanks for your reply. But what I am describing is that the time returned is not the original timestamp associated with the last row in each window. Data is inserted every 30s, so there is at least one data point every minute. So when every is 7m and timeSrc is _time, the actual data is at 2024-05-30 00:06:59 GMT+8 rather than 2024-05-30 00:03:59 GMT+8.

github-actions[bot] commented 1 month ago

This issue has had no recent activity and will be closed soon.

ccbellflower commented 1 month ago

This is an automatic holiday reply from QQ Mail. Hello, I have received your email and will handle it as soon as possible. Thank you. 陈艺