timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/

[Bug]: ERROR: cannot create continuous aggregate with integer-based time in another timezone #5300

Open wutuobangbang opened 1 year ago

wutuobangbang commented 1 year ago

What type of bug is this?

Unexpected error

What subsystems and features are affected?

Continuous aggregate

What happened?

I want to create a continuous aggregate summary_daily on a table ts_kv that uses integer-based time:

CREATE TABLE IF NOT EXISTS ts_kv (
    entity_id uuid NOT NULL,
    key int NOT NULL,
    ts bigint NOT NULL,
    bool_v boolean,
    str_v varchar(10000000),
    long_v bigint,
    dbl_v double precision,
    json_v json,
    CONSTRAINT ts_kv_pkey PRIMARY KEY (entity_id, key, ts)
);

SELECT create_hypertable('ts_kv', 'ts', chunk_time_interval => 604800000, if_not_exists => true);

CREATE FUNCTION current_microfortnight() RETURNS BIGINT
    LANGUAGE SQL STABLE AS $$
SELECT CAST((1209600 * EXTRACT(EPOCH FROM CURRENT_TIME) / 1000000) AS BIGINT)
$$;

SELECT set_integer_now_func('ts_kv', 'current_microfortnight');

create or replace function time_bucket_range( bucket_width interval, ts timestamptz) RETURNS tstzrange as $$
select tstzrange(time_bucket(bucket_width, ts), time_bucket(bucket_width, ts + bucket_width), '[)');
$$
    LANGUAGE SQL IMMUTABLE PARALLEL SAFE;
CREATE MATERIALIZED VIEW conditions_summary_daily2
    WITH (timescaledb.continuous) AS
SELECT entity_id,
       time_bucket('86400000', ts) AS time,
       sum(COALESCE(dbl_v, long_v, 0.0)) as EC_Day
FROM ts_kv
where key = 100
GROUP BY entity_id, time;

But the timezone is wrong. I want to set it to 'Asia/Shanghai', and I do not know how.

This is what I have tried, but I got errors:

CREATE MATERIALIZED VIEW conditions_summary_daily
    WITH (timescaledb.continuous) AS
SELECT entity_id,
       time_bucket('1 day', to_timestamp((ts + 28800000) / 1000)),
       sum(COALESCE(dbl_v, long_v, 0.0)) as EC_Day
FROM ts_kv
where key = 223
GROUP BY entity_id, time_bucket('1 day', to_timestamp((ts + 28800000) / 1000));

[0A000] ERROR: time bucket function must reference a hypertable dimension column

CREATE MATERIALIZED VIEW conditions_summary_daily
            WITH (timescaledb.continuous) AS
SELECT entity_id,
       time_bucket('86400000', ts, "offset" := 28800000) AS time,
       sum(COALESCE(dbl_v, long_v, 0.0)) as EC_Day
FROM ts_kv
where key = 223
GROUP BY entity_id, time;

[XX000] ERROR: continuous aggregate view must include a valid time bucket function

How can I create a continuous aggregate with integer-based time in another timezone? Thanks.

TimescaleDB version affected

2.9.2

PostgreSQL version used

14.6

What operating system did you use?

Docker image timescale/timescaledb:latest-pg14 on Windows10 host

What installation method did you use?

Docker

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

[XX000] ERROR: continuous aggregate view must include a valid time bucket function

[0A000] ERROR: time bucket function must reference a hypertable dimension column

How can we reproduce the bug?

CREATE TABLE IF NOT EXISTS ts_kv (
    entity_id uuid NOT NULL,
    key int NOT NULL,
    ts bigint NOT NULL,
    bool_v boolean,
    str_v varchar(10000000),
    long_v bigint,
    dbl_v double precision,
    json_v json,
    CONSTRAINT ts_kv_pkey PRIMARY KEY (entity_id, key, ts)
);

SELECT create_hypertable('ts_kv', 'ts', chunk_time_interval => 604800000, if_not_exists => true);

CREATE FUNCTION current_microfortnight() RETURNS BIGINT
    LANGUAGE SQL STABLE AS $$
SELECT CAST((1209600 * EXTRACT(EPOCH FROM CURRENT_TIME) / 1000000) AS BIGINT)
$$;

SELECT set_integer_now_func('ts_kv', 'current_microfortnight');

create or replace function time_bucket_range( bucket_width interval, ts timestamptz) RETURNS tstzrange as $$
select tstzrange(time_bucket(bucket_width, ts), time_bucket(bucket_width, ts + bucket_width), '[)');
$$
    LANGUAGE SQL IMMUTABLE PARALLEL SAFE;

CREATE MATERIALIZED VIEW conditions_summary_daily2
    WITH (timescaledb.continuous) AS
SELECT entity_id,
       time_bucket('86400000', ts, 10000) AS time,
       sum(COALESCE(dbl_v, long_v, 0.0)) as EC_Day
FROM ts_kv
where key = 100
GROUP BY entity_id, time;

CREATE MATERIALIZED VIEW conditions_summary_daily
            WITH (timescaledb.continuous) AS
SELECT entity_id,
       time_bucket('86400000', ts, "offset" := 28800000) AS time,
       sum(COALESCE(dbl_v, long_v, 0.0)) as EC_Day
FROM ts_kv
where key = 223
GROUP BY entity_id, time;

CREATE MATERIALIZED VIEW conditions_summary_daily
    WITH (timescaledb.continuous) AS
SELECT entity_id,
       time_bucket('1 day', to_timestamp((ts + 28800000) / 1000)),
       sum(COALESCE(dbl_v, long_v, 0.0)) as EC_Day
FROM ts_kv
where key = 223
GROUP BY entity_id, time_bucket('1 day', to_timestamp((ts + 28800000) / 1000));

DROP MATERIALIZED VIEW conditions_summary_daily;
konskov commented 1 year ago

hello @wutuobangbang, thank you for reaching out. I'm a little confused by the use of 'timezone' in the issue title - should it say "offset" instead of "timezone"? There is no variant of the integer time_bucket that supports time zones; if you want to use a timezone, I would suggest using timestamps instead of integer time.

In the first case, you get ERROR: continuous aggregate view must include a valid time bucket function because, even though the time_bucket function you're using is a valid time_bucket function, it is not supported in the context of a continuous aggregate definition, and so is not valid there. We could show a clearer error message for this case. In the second case, you get time bucket function must reference a hypertable dimension column because you are using an expression that references the ts column rather than the ts column itself, and we do not allow expressions in the cagg definition. It looks like it would be easier for you to work with timestamps, where timezones are supported in cagg definitions. I would also suggest reaching out to us in the community Slack and forum if you haven't already, as those are better platforms for discussing and getting help with a particular use case, which this seems to be.
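For the timestamp route, a minimal sketch of what such an aggregate could look like (the table conditions_tz and its columns are hypothetical stand-ins for a timestamptz-based schema; the timezone argument to time_bucket requires TimescaleDB 2.8 or later):

```sql
-- Hypothetical timestamptz-based variant of the schema and aggregate.
CREATE TABLE conditions_tz (
    entity_id uuid NOT NULL,
    key int NOT NULL,
    ts timestamptz NOT NULL,
    dbl_v double precision,
    long_v bigint
);
SELECT create_hypertable('conditions_tz', 'ts');

-- time_bucket with a timezone argument buckets at local midnight,
-- handling DST rules for the named zone (not an issue for Asia/Shanghai,
-- but relevant for zones that observe DST).
CREATE MATERIALIZED VIEW conditions_summary_tz
    WITH (timescaledb.continuous) AS
SELECT entity_id,
       time_bucket('1 day', ts, 'Asia/Shanghai') AS day,
       sum(COALESCE(dbl_v, long_v, 0.0)) AS ec_day
FROM conditions_tz
WHERE key = 223
GROUP BY entity_id, day;
```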

wutuobangbang commented 1 year ago

Thank you for your reply. Forgive me, my English is not very good. I want to aggregate sums by day. If I create the continuous aggregate directly, the bucket boundaries fall at 8:00 every day, but I want them at 0:00. I think this is a timezone-related problem; I'm in Shanghai. Or how can I set an offset to subtract 8 hours?
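For reference, the integer variant of time_bucket aligns buckets roughly as floor((ts - offset) / width) * width + offset, so Shanghai local midnight (16:00 UTC of the previous day) corresponds to an offset of -28800000 ms on daily buckets. TimescaleDB 2.9 rejects this variant inside a continuous aggregate definition (that is this issue), but as a sketch, a plain query over the raw hypertable could bucket at local midnight (assuming ts holds milliseconds since the Unix epoch):

```sql
-- Plain aggregate query sketch (not a continuous aggregate).
-- An offset of -8 h in milliseconds puts bucket boundaries at
-- 16:00 UTC, i.e. 00:00 Asia/Shanghai.
SELECT entity_id,
       time_bucket(86400000::bigint, ts, -28800000::bigint) AS day_start_ms,
       sum(COALESCE(dbl_v, long_v, 0.0)) AS ec_day
FROM ts_kv
WHERE key = 223
GROUP BY entity_id, day_start_ms;
```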

jnidzwetzki commented 1 year ago

Hello @wutuobangbang,

Thank you for getting back to us. As @konskov has pointed out, switching to the timestamptz data type could solve the problem because then you could use the PostgreSQL functions for offsetting or timezone conversions. If this is not possible, you could try to adjust the timezone of your PostgreSQL session (e.g., SET TIME ZONE ...). As you mentioned, you are in GMT+8 (China Standard Time), which might explain the 8-hour offset you see in the query results.
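To illustrate the session-timezone suggestion, a sketch (this only changes how timestamptz values are rendered to the client; it does not move the underlying integer bucket boundaries):

```sql
-- Session time zone affects display of timestamptz values only.
SET TIME ZONE 'Asia/Shanghai';

-- Render the integer (millisecond) bucket starts as local timestamps.
SELECT entity_id,
       to_timestamp("time" / 1000) AS day_start_local
FROM conditions_summary_daily2
LIMIT 5;
```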

For further discussion of this use case, I recommend reaching out to our Community Slack. This is a good platform to discuss particular use cases.