risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.
https://go.risingwave.com/slack
Apache License 2.0
7.05k stars 579 forks source link

bug: window TVF can not handle month and year interval correctly #19075

Open st1page opened 3 weeks ago

st1page commented 3 weeks ago
dev=> create table t1 (v int, created_at timestamp);
insert into t1 values
    (1, '2021-01-01 10:00:00'),
    (2, '2000-01-01 10:05:00'),
    (3, '2012-01-01 10:14:00'),
    (4, '2042-01-01 10:22:00');
CREATE_TABLE
INSERT 0 4

dev=> select * from tumble(t1, created_at, interval '1 year');
 v |     created_at      |    window_start     |     window_end
---+---------------------+---------------------+---------------------
 3 | 2012-01-01 10:14:00 | 2011-05-26 00:00:00 | 2012-05-26 00:00:00
 1 | 2021-01-01 10:00:00 | 2020-04-08 00:00:00 | 2021-04-08 00:00:00
 2 | 2000-01-01 10:05:00 | 1999-07-28 00:00:00 | 2000-07-28 00:00:00
 4 | 2042-01-01 10:22:00 | 2041-12-14 00:00:00 | 2042-12-14 00:00:00
(4 rows)

dev=> select * from tumble(t1, created_at, interval '1 month');
 v |     created_at      |    window_start     |     window_end
---+---------------------+---------------------+---------------------
 3 | 2012-01-01 10:14:00 | 2011-12-22 00:00:00 | 2012-01-22 00:00:00
 4 | 2042-01-01 10:22:00 | 2041-12-14 00:00:00 | 2042-01-14 00:00:00
 1 | 2021-01-01 10:00:00 | 2020-12-04 00:00:00 | 2021-01-04 00:00:00
 2 | 2000-01-01 10:05:00 | 1999-12-25 00:00:00 | 2000-01-25 00:00:00
(4 rows)
fuyufjh commented 5 days ago

This case should probably use date_trunc()?

st1page commented 5 days ago

This case should probably use date_trunc()?

I recommend user use that, but I'm not sure there will be a window length that more than one. If we're not sure of the appropriate behavior, I guess we can ban them first so they don't get misused