influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

When a timezone and a positive interval offset are used in a query, data is lost on the daylight savings boundary #25078

Closed davidby-influx closed 6 days ago

davidby-influx commented 1 week ago

This is actually the second half of this issue, which was not detected or fixed at the time. When a GROUP BY query with a positive offset and a time zone crosses an autumnal time change (falling back), data can be dropped in the aggregation.

Create data across the November 2024 time change:

Remember to set the retention policy to infinite duration so that points can be written into the past and future (create the database with an infinite-duration retention policy before invoking inch).

> inch -v -time 5760m -start-time '2024-11-02T00:00:01Z' -t 1 -p 96 -db DST_TEST
> influx
> use stress
Using database stress
> select count(*) from m0 where time < '2024-11-11T00:00:01Z' group by time(1d, 12h) fill(none) TZ('America/Los_Angeles')
name: m0
time                count_v0
----                --------
1730487600000000000 19
1730574000000000000 24
1730664000000000000 20
1730750400000000000 24
1730836800000000000 4
> 
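
To see where the buckets fall relative to the time change, the bucket start times in the output above can be decoded. This is only a sketch for reading the output, not part of the reproduction; the names `bucket_starts_ns` and `to_utc` are made up for illustration. The timestamps themselves are copied from the query result.

```python
from datetime import datetime, timedelta, timezone

# Bucket start times (nanoseconds since the Unix epoch), copied from the
# TZ('America/Los_Angeles') query output above.
bucket_starts_ns = [
    1730487600000000000,
    1730574000000000000,
    1730664000000000000,
    1730750400000000000,
    1730836800000000000,
]


def to_utc(ns):
    """Convert a nanosecond epoch timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(ns // 1_000_000_000, tz=timezone.utc)


starts = [to_utc(ns) for ns in bucket_starts_ns]

# Width of each bucket, measured as the gap to the next bucket start.
gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]

for start in starts:
    print(start.isoformat())
for gap in gaps:
    print(gap)
```

The first two starts land on 19:00 UTC and the remaining three on 20:00 UTC, which is local noon in Los Angeles before and after the 2024-11-03 change from UTC-7 to UTC-8. The second bucket is therefore 25 hours wide, and that is the bucket that spans the fall-back.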

We created 96 points, but only found 91 of them. Remove the time zone, and we find them all:

> select count(*) from m0 where time < '2024-11-11T00:00:01Z' group by time(1d, 12h) fill(none)

name: m0
time                count_v0
----                --------
1730462400000000000 12
1730548800000000000 24
1730635200000000000 24
1730721600000000000 24
1730808000000000000 12
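
The two results can be cross-checked with a small tally. The sketch below assumes that inch spread the 96 points evenly over the 5760 minutes, i.e. one point per hour starting at the `-start-time` value; that spacing is an assumption about inch, although it is consistent with the 12/24/24/24/12 counts in the query without a time zone. The variable names are illustrative only.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone

# Assumption: 96 points, one hour apart (5760m / 96), starting at -start-time.
first_point = datetime(2024, 11, 2, 0, 0, 1, tzinfo=timezone.utc)
points = [first_point + timedelta(hours=i) for i in range(96)]

# Bucket starts (seconds since the epoch) and counts, copied from the
# TZ('America/Los_Angeles') query output.
bucket_starts = [
    datetime.fromtimestamp(s, tz=timezone.utc)
    for s in (1730487600, 1730574000, 1730664000, 1730750400, 1730836800)
]
reported_tz = [19, 24, 20, 24, 4]

# Counts copied from the query without a time zone.
reported_utc = [12, 24, 24, 24, 12]

# Assign every point to the last bucket whose start is <= the point's time.
expected_tz = [0] * len(bucket_starts)
for p in points:
    expected_tz[bisect_right(bucket_starts, p) - 1] += 1

shortfall = [e - r for e, r in zip(expected_tz, reported_tz)]

print("reported with TZ:   ", reported_tz, sum(reported_tz))
print("reported without TZ:", reported_utc, sum(reported_utc))
print("expected with TZ:   ", expected_tz, sum(expected_tz))
print("shortfall per bucket:", shortfall)
```

Under the hourly-spacing assumption, the shortfall is confined to the 25-hour bucket that spans the time change and the bucket immediately after it; the other three buckets match the reported counts.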