Closed hongquan closed 1 year ago
So you cannot query any data older than 2 days?
@svenklemm Right, when I do `SELECT * FROM farm_tscondition_10m`, it returns only data from the last 2 days, not older.
Could you please provide a complete self-contained script to reproduce the issue. Your cagg definition references columns not part of your table definitions.
Could you please provide a complete self-contained script to reproduce the issue.
This is a whole Django application. After the application runs for a while, I see the issue. I don't know which part reproduces it. I have temporarily made the source code public so you can check: https://gitlab.com/agriconnect/plantinghouse/-/blob/master/farm/models.py
Your cagg definition references columns not part of your table definitions.
Ah, yes. Each `tscondition` and `tsmeasure` table also has `node_id`, `crop_id`, and `room_id` columns. They are not for storing IoT data; they are just foreign keys to other tables.
The entire application might be a little too much for us to go through.
To better investigate your problem, let me describe how I understand the issue. You create an empty database and start writing to it; it works fine for two days, and then on the third day the old data starts to disappear from the view table. It continuously disappears over time, so that at every point in time only the two previous days are visible. Right?
To confirm this and narrow down the problem, please connect to the database with psql and run a SELECT like `select min(time_start) from farm_tscondition_10m`. If I understood you correctly, every day it should be no more than two days in the past. On the contrary, `select min(measured_at) from farm_tscondition` should always return the correct oldest value.
On the other hand, if you're talking about historical data that you are backfilling after creating the new database, the problem might be that the continuous aggregate is not refreshed for the historical data, because it is created with the `WITH NO DATA` option. In that case, you have to manually run `refresh_continuous_aggregate` to generate the continuous aggregate data for the historical range.
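For example, a manual refresh over the backfilled range could look like this (a sketch; the view name is taken from this thread, and the time range is purely illustrative):

```sql
-- Sketch: materialize historical data for a CAGG created WITH NO DATA.
-- Adjust the window below to cover the range you actually backfilled.
CALL refresh_continuous_aggregate(
    'farm_tscondition_10m',
    '2022-08-01 00:00:00+07',
    '2022-09-01 00:00:00+07'
);
```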
You create an empty database and start writing to it; it works fine for two days, and then on the third day the old data starts to disappear from the view table. It continuously disappears over time, so that at every point in time only the two previous days are visible. Right?
Right.
Here is the result of `min(time_start)`:
# SELECT min(time_start) FROM farm_tscondition_10m;
min
------------------------
2022-08-16 10:00:00+07
(1 row)
Hello @hongquan,
Thank you very much for your response. I am trying to reproduce the problem, but I have not succeeded so far. Would it be possible for you to answer the following questions so I can align my test environment to yours as much as possible?
* Could you provide us with the full definition of these tables along with their foreign keys (i.e., by executing `\d+ farm_tscondition;` and `\d+ farm_tsmeasure;`)?
* Could you run both queries that are provided by @akuzm and provide us with the results (i.e., `select min(time_start) from farm_tscondition_10m` and `select min(measured_at) from farm_tscondition`)?
@jnidzwetzki
* In the issue description, you mention two hypertables and two continuous aggregates. Are both hypertables / aggregates affected by the problem or only one?
In our prod deployments, only `farm_tscondition` has data (we don't have IoT devices to feed data to `farm_tsmeasure` yet), so I can only observe that `farm_tscondition_10m` is affected.
* You mention that both tables have foreign keys and more attributes as shown in the initial description. Could you provide us with the full definition of these tables along with their foreign keys (i.e., by executing `\d+ farm_tscondition;` and `\d+ farm_tsmeasure;`)?
Table "public.farm_tscondition"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------------------------+--------------------------+-----------+----------+---------+---------+-------------+--------------+-------------
measured_at | timestamp with time zone | | not null | | plain | | |
temperature | double precision | | | | plain | | |
humidity | double precision | | | | plain | | |
brightness | double precision | | | | plain | | |
carbonic | smallint | | | | plain | | |
moisture | double precision | | | | plain | | |
dielectric | double precision | | | | plain | | |
soil_temperature | double precision | | | | plain | | |
soil_electrical_conductivity | double precision | | | | plain | | |
electrical_conductivity | double precision | | | | plain | | |
hydrogen_power | double precision | | | | plain | | |
solution_temperature | double precision | | | | plain | | |
solution_salinity | double precision | | | | plain | | |
crop_id | integer | | | | plain | | |
node_id | integer | | | | plain | | |
room_id | integer | | | | plain | | |
soil_hydrogen_power | double precision | | | | plain | | |
ammonia | double precision | | | | plain | | |
Indexes:
"farm_tscondition_pkey" PRIMARY KEY, btree (measured_at)
"farm_tscondition_crop_id_09a48628" btree (crop_id)
"farm_tscondition_node_id_94d38da4" btree (node_id)
"farm_tscondition_node_index" btree (room_id, node_id, measured_at DESC) WHERE NOT room_id IS NULL AND NOT node_id IS NULL
"farm_tscondition_room_id_ce266840" btree (room_id)
Check constraints:
"farm_tscondition_carbonic_check" CHECK (carbonic >= 0)
Foreign-key constraints:
"farm_tscondition_crop_id_09a48628_fk_farm_crop_id" FOREIGN KEY (crop_id) REFERENCES farm_crop(id) DEFERRABLE INITIALLY DEFERRED
"farm_tscondition_node_id_94d38da4_fk_farm_node_id" FOREIGN KEY (node_id) REFERENCES farm_node(id) DEFERRABLE INITIALLY DEFERRED
"farm_tscondition_room_id_ce266840_fk_farm_room_id" FOREIGN KEY (room_id) REFERENCES farm_room(id) DEFERRABLE INITIALLY DEFERRED
Triggers:
ts_cagg_invalidation_trigger AFTER INSERT OR DELETE OR UPDATE ON farm_tscondition FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.continuous_agg_invalidation_trigger('1')
ts_insert_blocker BEFORE INSERT ON farm_tscondition FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Child tables: _timescaledb_internal._hyper_1_30711_chunk,
_timescaledb_internal._hyper_1_30713_chunk,
_timescaledb_internal._hyper_1_30715_chunk,
_timescaledb_internal._hyper_1_30717_chunk,
_timescaledb_internal._hyper_1_30719_chunk,
_timescaledb_internal._hyper_1_30721_chunk,
_timescaledb_internal._hyper_1_30723_chunk,
_timescaledb_internal._hyper_1_30726_chunk,
_timescaledb_internal._hyper_1_30728_chunk,
_timescaledb_internal._hyper_1_30730_chunk,
_timescaledb_internal._hyper_1_30732_chunk,
_timescaledb_internal._hyper_1_30734_chunk,
_timescaledb_internal._hyper_1_30736_chunk,
_timescaledb_internal._hyper_1_30738_chunk,
_timescaledb_internal._hyper_1_30740_chunk,
_timescaledb_internal._hyper_1_30742_chunk,
_timescaledb_internal._hyper_1_30744_chunk,
_timescaledb_internal._hyper_1_30747_chunk,
_timescaledb_internal._hyper_1_30749_chunk,
_timescaledb_internal._hyper_1_30751_chunk,
_timescaledb_internal._hyper_1_30753_chunk,
_timescaledb_internal._hyper_1_30755_chunk,
_timescaledb_internal._hyper_1_30757_chunk,
_timescaledb_internal._hyper_1_30759_chunk,
_timescaledb_internal._hyper_1_30761_chunk,
_timescaledb_internal._hyper_1_30763_chunk,
_timescaledb_internal._hyper_1_30765_chunk,
_timescaledb_internal._hyper_1_30768_chunk
Access method: heap
Table "public.farm_tsmeasure"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-------------------+--------------------------+-----------+----------+---------+---------+-------------+--------------+-------------
measured_at | timestamp with time zone | | not null | | plain | | |
bird_in | integer | | | | plain | | |
bird_out | integer | | | | plain | | |
voltage | double precision | | | | plain | | |
electric_current | double precision | | | | plain | | |
power | double precision | | | | plain | | |
electrical_energy | double precision | | | | plain | | |
ampli_1_left | double precision | | | | plain | | |
ampli_1_right | double precision | | | | plain | | |
ampli_2_left | double precision | | | | plain | | |
ampli_2_right | double precision | | | | plain | | |
ampli_3_left | double precision | | | | plain | | |
ampli_3_right | double precision | | | | plain | | |
ampli_4_left | double precision | | | | plain | | |
ampli_4_right | double precision | | | | plain | | |
crop_id | integer | | | | plain | | |
node_id | integer | | | | plain | | |
room_id | integer | | | | plain | | |
Indexes:
"farm_tsmeasure_pkey" PRIMARY KEY, btree (measured_at)
"farm_tsmeasure_crop_id_3dfd561d" btree (crop_id)
"farm_tsmeasure_node_id_fe716720" btree (node_id)
"farm_tsmeasure_node_index" btree (room_id, node_id, measured_at DESC) WHERE NOT room_id IS NULL AND NOT node_id IS NULL
"farm_tsmeasure_room_id_675ed512" btree (room_id)
Check constraints:
"farm_tsmeasure_bird_in_check" CHECK (bird_in >= 0)
"farm_tsmeasure_bird_out_check" CHECK (bird_out >= 0)
Foreign-key constraints:
"farm_tsmeasure_crop_id_3dfd561d_fk_farm_crop_id" FOREIGN KEY (crop_id) REFERENCES farm_crop(id) DEFERRABLE INITIALLY DEFERRED
"farm_tsmeasure_node_id_fe716720_fk_farm_node_id" FOREIGN KEY (node_id) REFERENCES farm_node(id) DEFERRABLE INITIALLY DEFERRED
"farm_tsmeasure_room_id_675ed512_fk_farm_room_id" FOREIGN KEY (room_id) REFERENCES farm_room(id) DEFERRABLE INITIALLY DEFERRED
Triggers:
ts_cagg_invalidation_trigger AFTER INSERT OR DELETE OR UPDATE ON farm_tsmeasure FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.continuous_agg_invalidation_trigger('2')
ts_insert_blocker BEFORE INSERT ON farm_tsmeasure FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Child tables: _timescaledb_internal._hyper_2_30712_chunk,
_timescaledb_internal._hyper_2_30714_chunk,
_timescaledb_internal._hyper_2_30716_chunk,
_timescaledb_internal._hyper_2_30718_chunk,
_timescaledb_internal._hyper_2_30720_chunk,
_timescaledb_internal._hyper_2_30722_chunk,
_timescaledb_internal._hyper_2_30724_chunk,
_timescaledb_internal._hyper_2_30727_chunk,
_timescaledb_internal._hyper_2_30729_chunk,
_timescaledb_internal._hyper_2_30731_chunk,
_timescaledb_internal._hyper_2_30733_chunk,
_timescaledb_internal._hyper_2_30735_chunk,
_timescaledb_internal._hyper_2_30737_chunk,
_timescaledb_internal._hyper_2_30739_chunk,
_timescaledb_internal._hyper_2_30741_chunk,
_timescaledb_internal._hyper_2_30743_chunk,
_timescaledb_internal._hyper_2_30745_chunk,
_timescaledb_internal._hyper_2_30748_chunk,
_timescaledb_internal._hyper_2_30750_chunk,
_timescaledb_internal._hyper_2_30752_chunk,
_timescaledb_internal._hyper_2_30754_chunk,
_timescaledb_internal._hyper_2_30756_chunk,
_timescaledb_internal._hyper_2_30758_chunk,
_timescaledb_internal._hyper_2_30760_chunk,
_timescaledb_internal._hyper_2_30762_chunk,
_timescaledb_internal._hyper_2_30764_chunk,
_timescaledb_internal._hyper_2_30766_chunk,
_timescaledb_internal._hyper_2_30769_chunk
Access method: heap
* I would like to understand whether the continuous aggregate is affected by the problem or the entire hypertable. So, could you run both queries that are provided by @akuzm and provide us with the results (i.e., `select min(time_start) from farm_tscondition_10m` and `select min(measured_at) from farm_tscondition`)?
# SELECT min(time_start) FROM farm_tscondition_10m;
min
------------------------
2022-09-11 17:00:00+07
(1 row)
# SELECT min(measured_at) FROM farm_tscondition;
min
-------------------------------
2022-09-11 17:00:04.932698+07
(1 row)
Hello @hongquan,
Thank you very much for all this information. It looks like the data is not only removed from the continuous aggregate; it is also removed from the underlying hypertable. In addition, it appears that only the data from the last 26 hours is available. Have you ever observed that the time range in which the data is deleted varies? Is it a fixed time window of 26 hours / 2 days, or does it change?
Could you also execute the following query to check that there is no data retention policy configured that deletes data automatically?
SELECT j.hypertable_name,
j.job_id,
config,
schedule_interval,
job_status,
last_run_status,
last_run_started_at,
js.next_start,
total_runs,
total_successes,
total_failures
FROM timescaledb_information.jobs j
JOIN timescaledb_information.job_stats js
ON j.job_id = js.job_id
WHERE j.proc_name = 'policy_retention';
You mention that only the table `farm_tscondition` is currently used by your application. Would it be possible to store some test data in the table `farm_tsmeasure` to check whether or not this table is also affected by the problem?
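For instance, a single synthetic row would be enough (a sketch; the column values are made up, and it assumes rows with id 1 exist in `farm_node` and `farm_room` to satisfy the foreign keys):

```sql
-- Sketch: write one synthetic row into farm_tsmeasure so we can watch
-- whether farm_tsmeasure_10m loses it after ~2 days as well.
INSERT INTO farm_tsmeasure (measured_at, voltage, power, node_id, room_id)
VALUES (now(), 220.0, 55.0, 1, 1);  -- assumes farm_node/farm_room rows with id 1
```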
@jnidzwetzki
the data is also removed from the underlying hypertable
This is expected, because I have a retention policy to remove data older than 1 day from the hypertables.
Have you ever experienced that the time range in which the data is deleted varies?
Yes, it varies, but it is still about 2 days (> 24 hours).
Could you also execute the following query to check that there is no data retention policy configured that deletes data automatically?
This is the result:
hypertable_name | job_id | config | schedule_interval | job_status | last_run_status | last_run_started_at | next_start | total_runs | total_successes | total_failures
------------------+--------+------------------------------------------------+-------------------+------------+-----------------+-------------------------------+-------------------------------+------------+-----------------+----------------
farm_tscondition | 1002 | {"drop_after": "24:00:00", "hypertable_id": 1} | 1 day | Scheduled | Success | 2022-09-13 17:09:59.857615+07 | 2022-09-14 17:10:00.477904+07 | 637 | 600 | 34
farm_tsmeasure | 1003 | {"drop_after": "24:00:00", "hypertable_id": 2} | 1 day | Scheduled | Success | 2022-09-13 17:04:31.73244+07 | 2022-09-14 17:04:32.389057+07 | 634 | 600 | 34
(2 rows)
Would it be possible to store some test data in the table farm_tsmeasure to check whether or not this table is also affected by the problem?
Yes, I will try.
Hello @hongquan,
Thank you for getting back to us. The retention policy explains why the data in the hypertable is removed. For some reason, the continuous aggregate data is also deleted, even though the update policy of the CAGGs seems to be configured correctly and no update should be performed for the deleted data.
It seems that something is refreshing the CAGG for the time interval of the deleted data, which causes the removal of the CAGG data. In the output of the `SELECT min(...)` queries you can see that the time ranges of the hypertable and the CAGG match.
Please verify that your refresh policy is configured correctly and that no manual updates of the CAGG are performed for the deleted time ranges (e.g., by calling `refresh_continuous_aggregate(...)`). You can check the policies by executing the following query:
SELECT j.hypertable_name,
j.job_id,
j.proc_name,
config,
schedule_interval,
job_status,
last_run_status,
last_run_started_at,
js.next_start,
total_runs,
total_successes,
total_failures
FROM timescaledb_information.jobs j
JOIN timescaledb_information.job_stats js
ON j.job_id = js.job_id;
If the problem still persists, please provide us with a self-contained example with all of your configured policies so that we can reproduce the problem. You are currently using TimescaleDB 2.6.1, does this problem also occur with the current TimescaleDB version 2.8.0?
Hi @jnidzwetzki
You can check the policies by executing the following query
This is the result:
hypertable_name | job_id | proc_name | config | schedule_interval | job_status | last_run_status | last_run_started_at | next_start | total_runs | total_successes | total_failures
----------------------------+--------+-------------------------------------+-----------------------------------------------------------------------------+-------------------+------------+-----------------+-------------------------------+-------------------------------+------------+-----------------+----------------
_materialized_hypertable_4 | 1001 | policy_refresh_continuous_aggregate | {"end_offset": "00:20:00", "start_offset": "1 day", "mat_hypertable_id": 4} | 00:20:00 | Scheduled | Success | 2022-09-14 17:38:34.380488+07 | 2022-09-14 17:58:35.304035+07 | 34904 | 32866 | 2038
farm_tscondition | 1002 | policy_retention | {"drop_after": "24:00:00", "hypertable_id": 1} | 1 day | Scheduled | Success | 2022-09-14 17:10:00.479657+07 | 2022-09-15 17:10:01.104891+07 | 638 | 601 | 34
farm_tsmeasure | 1003 | policy_retention | {"drop_after": "24:00:00", "hypertable_id": 2} | 1 day | Scheduled | Success | 2022-09-14 17:04:32.392344+07 | 2022-09-15 17:04:33.054147+07 | 635 | 601 | 34
| 1 | policy_telemetry | | 24:00:00 | Scheduled | Success | 2022-09-14 00:27:02.605195+07 | 2022-09-15 00:27:05.896366+07 | 694 | 594 | 100
please provide us with a self-contained example with all of your configured policies so that we can reproduce the problem
I don't know how to provide such self-contained example. However, I'm opening the source code and you can see my configured policies in https://gitlab.com/agriconnect/plantinghouse/-/blob/master/farm/migrations/0002_tscondition_tsmeasure_views.sql
You are currently using TimescaleDB 2.6.1, does this problem also occur with the current TimescaleDB version 2.8.0
I haven't upgraded to v2.8 yet, but v2.7 also suffers from this issue. I will upgrade to v2.8 to see.
Hello @hongquan,
Thanks for executing the query. According to the result, there is only one continuous aggregate refresh policy configured at the moment. The CAGG is refreshed every 20 minutes for the last day (`{"end_offset": "00:20:00", "start_offset": "1 day"}`). So, the refresh interval of the CAGG overlaps with the data retention policy that deletes the data after 24 hours.
In this case, the data of the CAGG is also affected by the data retention (see the documentation for more details about overlapping intervals).
Please decrease the refresh interval of the CAGG or change the data retention policy to keep data for at least two days. For example:
SELECT remove_continuous_aggregate_policy('farm_tscondition_10m');
SELECT add_continuous_aggregate_policy('farm_tscondition_10m',
start_offset => INTERVAL '1 hour',
end_offset => INTERVAL '10 minutes',
schedule_interval => INTERVAL '10 minutes');
SELECT remove_retention_policy('farm_tscondition');
SELECT add_retention_policy('farm_tscondition', INTERVAL '2 days');
It is weird that the CAGG is refreshed every 20 min because I configured the refresh interval to be 10 min, as written in https://gitlab.com/agriconnect/plantinghouse/-/blob/master/farm/migrations/0002_tscondition_tsmeasure_views.sql.
Now I choose to increase the retention policy to 2 days. Let's see...
@jnidzwetzki The issue remains after I upgraded to TimescaleDB v2.8 and increased retention policy to 2 days:
# SELECT min(time_start) FROM farm_tscondition_10m;
min
------------------------
2022-09-16 20:00:00+07
(1 row)
# SELECT min(measured_at) FROM farm_tscondition;
min
-------------------------------
2022-09-16 20:00:01.082967+07
(1 row)
hypertable_name | job_id | proc_name | config | schedule_interval | job_status | last_run_status | last_run_started_at | next_start | total_runs | total_successes | total_failures
----------------------------+--------+-------------------------------------+-----------------------------------------------------------------------------+-------------------+------------+-----------------+-------------------------------+-------------------------------+------------+-----------------+----------------
_materialized_hypertable_4 | 1001 | policy_refresh_continuous_aggregate | {"end_offset": "00:20:00", "start_offset": "1 day", "mat_hypertable_id": 4} | 00:20:00 | Scheduled | Success | 2022-09-19 10:14:59.604224+07 | 2022-09-19 10:35:00.529568+07 | 35239 | 33201 | 2038
farm_tscondition | 1004 | policy_retention | {"drop_after": "2 days", "hypertable_id": 1} | 1 day | Scheduled | Success | 2022-09-18 20:43:33.153523+07 | 2022-09-19 20:43:33.771559+07 | 5 | 5 | 0
farm_tsmeasure | 1003 | policy_retention | {"drop_after": "24:00:00", "hypertable_id": 2} | 1 day | Scheduled | Success | 2022-09-18 17:04:35.499136+07 | 2022-09-19 17:04:36.150056+07 | 639 | 605 | 34
| 1 | policy_telemetry | | 24:00:00 | Scheduled | Success | 2022-09-19 00:27:21.284439+07 | 2022-09-20 00:27:24.536283+07 | 699 | 599 | 100
(4 rows)
Hello @hongquan,
Thank you very much for the update. I am currently trying to reproduce the problem in my local environment. I have created a minimal example based on the information you provided and am now waiting 24 hours to analyze what data is being deleted from the hypertable and the continuous aggregate.
To make sure my environment is very similar to yours, could you send us the output of the following two commands?
\d+ farm_tscondition_10m;
\d+ farm_tsmeasure_10m;
In the output of the `jobs` query, you can see that the retention policy is set to 2 days for only one of the hypertables. For the second hypertable, it is still set to 24 hours. Please check whether this is intentional and, if necessary, change the retention policy for the second hypertable as well.
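Following the same pattern as suggested for `farm_tscondition`, aligning the second hypertable could look like this (a sketch, assuming the 2-day window is what you want for both tables):

```sql
-- Sketch: replace the 24-hour retention policy on farm_tsmeasure
-- with a 2-day one, mirroring the change made to farm_tscondition.
SELECT remove_retention_policy('farm_tsmeasure');
SELECT add_retention_policy('farm_tsmeasure', INTERVAL '2 days');
```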
@jnidzwetzki
\d+ farm_tscondition_10m;
View "public.farm_tscondition_10m"
Column | Type | Collation | Nullable | Default | Storage | Description
----------------------------------+--------------------------+-----------+----------+---------+---------+-------------
node_id | integer | | | | plain |
crop_id | integer | | | | plain |
room_id | integer | | | | plain |
time_start | timestamp with time zone | | | | plain |
avg_temperature | double precision | | | | plain |
avg_humidity | double precision | | | | plain |
avg_brightness | double precision | | | | plain |
avg_carbonic | numeric | | | | main |
avg_ammonia | double precision | | | | plain |
avg_moisture | double precision | | | | plain |
avg_dielectric | double precision | | | | plain |
avg_soil_temperature | double precision | | | | plain |
avg_soil_electrical_conductivity | double precision | | | | plain |
avg_electrical_conductivity | double precision | | | | plain |
avg_hydrogen_power | double precision | | | | plain |
avg_solution_temperature | double precision | | | | plain |
avg_solution_salinity | double precision | | | | plain |
View definition:
SELECT _materialized_hypertable_6.node_id,
_materialized_hypertable_6.crop_id,
_materialized_hypertable_6.room_id,
_materialized_hypertable_6.time_start,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_5_5, NULL::double precision) AS avg_temperature,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_6_6, NULL::double precision) AS avg_humidity,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_7_7, NULL::double precision) AS avg_brightness,
_timescaledb_internal.finalize_agg('pg_catalog.avg(smallint)'::text, NULL::name, NULL::name, '{{pg_catalog,int2}}'::name[], _materialized_hypertable_6.agg_8_8, NULL::numeric) AS avg_carbonic,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_9_9, NULL::double precision) AS avg_ammonia,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_10_10, NULL::double precision) AS avg_moisture,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_11_11, NULL::double precision) AS avg_dielectric,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_12_12, NULL::double precision) AS avg_soil_temperature,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_13_13, NULL::double precision) AS avg_soil_electrical_conductivity,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_14_14, NULL::double precision) AS avg_electrical_conductivity,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_15_15, NULL::double precision) AS avg_hydrogen_power,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_16_16, NULL::double precision) AS avg_solution_temperature,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_6.agg_17_17, NULL::double precision) AS avg_solution_salinity
FROM _timescaledb_internal._materialized_hypertable_6
WHERE _materialized_hypertable_6.time_start < COALESCE(_timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(6)), '-infinity'::timestamp with time zone)
GROUP BY _materialized_hypertable_6.node_id, _materialized_hypertable_6.crop_id, _materialized_hypertable_6.room_id, _materialized_hypertable_6.time_start
UNION ALL
SELECT farm_tscondition.node_id,
farm_tscondition.crop_id,
farm_tscondition.room_id,
time_bucket('00:10:00'::interval, farm_tscondition.measured_at) AS time_start,
avg(farm_tscondition.temperature) AS avg_temperature,
avg(farm_tscondition.humidity) AS avg_humidity,
avg(farm_tscondition.brightness) AS avg_brightness,
avg(farm_tscondition.carbonic) AS avg_carbonic,
avg(farm_tscondition.ammonia) AS avg_ammonia,
avg(farm_tscondition.moisture) AS avg_moisture,
avg(farm_tscondition.dielectric) AS avg_dielectric,
avg(farm_tscondition.soil_temperature) AS avg_soil_temperature,
avg(farm_tscondition.soil_electrical_conductivity) AS avg_soil_electrical_conductivity,
avg(farm_tscondition.electrical_conductivity) AS avg_electrical_conductivity,
avg(farm_tscondition.hydrogen_power) AS avg_hydrogen_power,
avg(farm_tscondition.solution_temperature) AS avg_solution_temperature,
avg(farm_tscondition.solution_salinity) AS avg_solution_salinity
FROM farm_tscondition
WHERE farm_tscondition.node_id IS NOT NULL AND farm_tscondition.room_id IS NOT NULL AND farm_tscondition.measured_at >= COALESCE(_timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(6)), '-infinity'::timestamp with time zone)
GROUP BY farm_tscondition.node_id, farm_tscondition.crop_id, farm_tscondition.room_id, (time_bucket('00:10:00'::interval, farm_tscondition.measured_at));
\d+ farm_tsmeasure_10m;
View "public.farm_tsmeasure_10m"
Column | Type | Collation | Nullable | Default | Storage | Description
-----------------------+--------------------------+-----------+----------+---------+---------+-------------
node_id | integer | | | | plain |
crop_id | integer | | | | plain |
room_id | integer | | | | plain |
time_start | timestamp with time zone | | | | plain |
sum_bird_in | bigint | | | | plain |
sum_bird_out | bigint | | | | plain |
avg_voltage | double precision | | | | plain |
avg_electric_current | double precision | | | | plain |
avg_power | double precision | | | | plain |
avg_electrical_energy | double precision | | | | plain |
avg_ampli_1_left | double precision | | | | plain |
avg_ampli_1_right | double precision | | | | plain |
avg_ampli_2_left | double precision | | | | plain |
avg_ampli_2_right | double precision | | | | plain |
avg_ampli_3_left | double precision | | | | plain |
avg_ampli_3_right | double precision | | | | plain |
avg_ampli_4_left | double precision | | | | plain |
avg_ampli_4_right | double precision | | | | plain |
View definition:
SELECT _materialized_hypertable_4.node_id,
_materialized_hypertable_4.crop_id,
_materialized_hypertable_4.room_id,
_materialized_hypertable_4.time_start,
_timescaledb_internal.finalize_agg('pg_catalog.sum(integer)'::text, NULL::name, NULL::name, '{{pg_catalog,int4}}'::name[], _materialized_hypertable_4.agg_5_5, NULL::bigint) AS sum_bird_in,
_timescaledb_internal.finalize_agg('pg_catalog.sum(integer)'::text, NULL::name, NULL::name, '{{pg_catalog,int4}}'::name[], _materialized_hypertable_4.agg_6_6, NULL::bigint) AS sum_bird_out,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_7_7, NULL::double precision) AS avg_voltage,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_8_8, NULL::double precision) AS avg_electric_current,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_9_9, NULL::double precision) AS avg_power,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_10_10, NULL::double precision) AS avg_electrical_energy,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_11_11, NULL::double precision) AS avg_ampli_1_left,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_12_12, NULL::double precision) AS avg_ampli_1_right,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_13_13, NULL::double precision) AS avg_ampli_2_left,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_14_14, NULL::double precision) AS avg_ampli_2_right,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_15_15, NULL::double precision) AS avg_ampli_3_left,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_16_16, NULL::double precision) AS avg_ampli_3_right,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_17_17, NULL::double precision) AS avg_ampli_4_left,
_timescaledb_internal.finalize_agg('pg_catalog.avg(double precision)'::text, NULL::name, NULL::name, '{{pg_catalog,float8}}'::name[], _materialized_hypertable_4.agg_18_18, NULL::double precision) AS avg_ampli_4_right
FROM _timescaledb_internal._materialized_hypertable_4
WHERE _materialized_hypertable_4.time_start < COALESCE(_timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(4)), '-infinity'::timestamp with time zone)
GROUP BY _materialized_hypertable_4.node_id, _materialized_hypertable_4.crop_id, _materialized_hypertable_4.room_id, _materialized_hypertable_4.time_start
UNION ALL
SELECT farm_tsmeasure.node_id,
farm_tsmeasure.crop_id,
farm_tsmeasure.room_id,
time_bucket('00:10:00'::interval, farm_tsmeasure.measured_at) AS time_start,
sum(farm_tsmeasure.bird_in) AS sum_bird_in,
sum(farm_tsmeasure.bird_out) AS sum_bird_out,
avg(farm_tsmeasure.voltage) AS avg_voltage,
avg(farm_tsmeasure.electric_current) AS avg_electric_current,
avg(farm_tsmeasure.power) AS avg_power,
avg(farm_tsmeasure.electrical_energy) AS avg_electrical_energy,
avg(farm_tsmeasure.ampli_1_left) AS avg_ampli_1_left,
avg(farm_tsmeasure.ampli_1_right) AS avg_ampli_1_right,
avg(farm_tsmeasure.ampli_2_left) AS avg_ampli_2_left,
avg(farm_tsmeasure.ampli_2_right) AS avg_ampli_2_right,
avg(farm_tsmeasure.ampli_3_left) AS avg_ampli_3_left,
avg(farm_tsmeasure.ampli_3_right) AS avg_ampli_3_right,
avg(farm_tsmeasure.ampli_4_left) AS avg_ampli_4_left,
avg(farm_tsmeasure.ampli_4_right) AS avg_ampli_4_right
FROM farm_tsmeasure
WHERE farm_tsmeasure.node_id IS NOT NULL AND farm_tsmeasure.room_id IS NOT NULL AND farm_tsmeasure.measured_at >= COALESCE(_timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(4)), '-infinity'::timestamp with time zone)
GROUP BY farm_tsmeasure.node_id, farm_tsmeasure.crop_id, farm_tsmeasure.room_id, (time_bucket('00:10:00'::interval, farm_tsmeasure.measured_at));
if necessary, change the retention policy for the second hypertable as well.
OK, I've just lengthened the retention policy for ts_measure to 2 days.
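For reference, changing an existing retention policy is done by removing it and adding it back with the new interval (a sketch, using the hypertable name from this thread; adjust the interval as needed):

```sql
-- Drop the current retention policy on the raw hypertable...
SELECT remove_retention_policy('farm_tsmeasure');
-- ...and re-add it with a longer drop window.
SELECT add_retention_policy('farm_tsmeasure', drop_after => INTERVAL '2 days');
```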
Hello @hongquan,
Thank you for executing the queries and providing us with the output. I was able to identify the root cause of the problem.
You have two continuous aggregates (farm_tscondition_10m and farm_tsmeasure_10m). You mentioned that only farm_tscondition_10m is used at the moment and that it is not clear whether farm_tsmeasure_10m is also affected by the problem. In the most recent query output, you can see that the materialized data of farm_tscondition_10m is stored in the table _materialized_hypertable_6; the materialized data of the farm_tsmeasure_10m continuous aggregate is stored in _materialized_hypertable_4.
According to your job definition, there is only one continuous aggregate refresh policy defined at the moment. This policy refreshes the data of _materialized_hypertable_4 (see {"end_offset": "00:20:00", "start_offset": "1 day", "mat_hypertable_id": 4} in the job output).
hypertable_name | job_id | proc_name | config | schedule_interval | job_status | last_run_status | last_run_started_at | next_start | total_runs | total_successes | total_failures
----------------------------+--------+-------------------------------------+-----------------------------------------------------------------------------+-------------------+------------+-----------------+-------------------------------+-------------------------------+------------+-----------------+----------------
_materialized_hypertable_4 | 1001 | policy_refresh_continuous_aggregate | {"end_offset": "00:20:00", "start_offset": "1 day", "mat_hypertable_id": 4} | 00:20:00 | Scheduled | Success | 2022-09-19 10:14:59.604224+07 | 2022-09-19 10:35:00.529568+07 | 35239 | 33201 | 2038
farm_tscondition | 1004 | policy_retention | {"drop_after": "2 days", "hypertable_id": 1} | 1 day | Scheduled | Success | 2022-09-18 20:43:33.153523+07 | 2022-09-19 20:43:33.771559+07 | 5 | 5 | 0
farm_tsmeasure | 1003 | policy_retention | {"drop_after": "24:00:00", "hypertable_id": 2} | 1 day | Scheduled | Success | 2022-09-18 17:04:35.499136+07 | 2022-09-19 17:04:36.150056+07 | 639 | 605 | 34
| 1 | policy_telemetry | | 24:00:00 | Scheduled | Success | 2022-09-19 00:27:21.284439+07 | 2022-09-20 00:27:24.536283+07 | 699 | 599 | 100
(4 rows)
So, only the data for farm_tsmeasure_10m is materialized to disk; the data for farm_tscondition_10m is never materialized. If you query that continuous aggregate, the data is computed on the fly using real-time aggregation. Once the retention policy removes the data from the underlying hypertable, real-time aggregation can no longer access it, and it is no longer available in the continuous aggregate.
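One way to confirm this state is to compare the continuous aggregate's watermark with the oldest raw row still kept by the retention policy (a diagnostic sketch; the mat_hypertable_id value 6 is taken from the job output in this thread):

```sql
-- The watermark separates materialized data (older) from data served by
-- real-time aggregation (newer). If no refresh policy ever ran, it stays
-- at -infinity and nothing is ever materialized.
SELECT _timescaledb_internal.to_timestamp(
           _timescaledb_internal.cagg_watermark(6)) AS tscondition_watermark;

-- Compare with the oldest raw row the retention policy has left behind.
SELECT min(measured_at) FROM farm_tscondition;
```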
You mentioned in your initial post that a refresh policy should be configured for farm_tscondition_10m, but the policy does not seem to be present on this system. So please re-execute the command:
SELECT add_continuous_aggregate_policy('farm_tscondition_10m',
start_offset => INTERVAL '1 h',
end_offset => INTERVAL '10 m',
schedule_interval => INTERVAL '10 m');
to create a continuous aggregate refresh policy and to keep the data of the continuous aggregate even if the retention policy removes the data from the underlying hypertable.
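After running the command, the new policy should appear as a policy_refresh_continuous_aggregate job (a quick check using the informational views, similar to the job query used earlier in this thread):

```sql
-- List all refresh policies; one row per continuous aggregate is expected.
SELECT job_id, hypertable_name, config
FROM timescaledb_information.jobs
WHERE proc_name = 'policy_refresh_continuous_aggregate';
```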
After adding the CAgg policy for farm_tscondition_10m, the CAgg data is now kept as expected. Thank you, @jnidzwetzki. This issue is resolved.
This problem came back, even though I have the CAgg policies in place:
SELECT j.hypertable_name,
j.job_id,
j.proc_name,
config,
schedule_interval,
job_status,
last_run_status,
last_run_started_at,
js.next_start,
total_runs,
total_successes,
total_failures
FROM timescaledb_information.jobs j
JOIN timescaledb_information.job_stats js
ON j.job_id = js.job_id;
hypertable_name | job_id | proc_name | config | schedule_interval | job_status | last_run_status | last_run_started_at | next_start | total_runs | total_successes | total_failures
----------------------------+--------+-------------------------------------+--------------------------------------------------------------------------------+-------------------+------------+-----------------+-------------------------------+-------------------------------+------------+-----------------+----------------
_materialized_hypertable_5 | 1003 | policy_refresh_continuous_aggregate | {"end_offset": "00:10:00", "start_offset": "01:00:00", "mat_hypertable_id": 5} | 00:10:00 | Scheduled | Success | 2023-04-04 15:10:06.579911+07 | 2023-04-04 15:20:07.305988+07 | 14092 | 14092 | 0
_materialized_hypertable_6 | 1004 | policy_refresh_continuous_aggregate | {"end_offset": "00:10:00", "start_offset": "01:00:00", "mat_hypertable_id": 6} | 00:10:00 | Scheduled | Success | 2023-04-04 15:09:24.597891+07 | 2023-04-04 15:19:24.685044+07 | 172 | 172 | 0
farm_tscondition | 1000 | policy_retention | {"drop_after": "24:00:00", "hypertable_id": 1} | 1 day | Scheduled | Success | 2023-04-03 20:46:09.614576+07 | 2023-04-04 20:46:10.895492+07 | 225 | 216 | 9
farm_tsmeasure | 1001 | policy_retention | {"drop_after": "24:00:00", "hypertable_id": 2} | 1 day | Scheduled | Success | 2023-04-03 20:52:20.479195+07 | 2023-04-04 20:52:21.01931+07 | 225 | 216 | 9
| 1 | policy_telemetry | | 24:00:00 | Scheduled | Success | 2023-04-04 11:54:35.120112+07 | 2023-04-05 11:54:37.701018+07 | 236 | 225 | 11
| 2 | policy_job_error_retention | {"drop_after": "1 month"} | 1 mon | Scheduled | Success | 2023-04-01 07:00:00.003715+07 | 2023-05-01 07:00:00+07 | 5 | 5 | 0
(6 rows)
My software version now is:
$ dpkg -l | rg timescaledb
ii timescaledb-2-loader-postgresql-14 2.10.1~ubuntu22.04 amd64 The loader for TimescaleDB to load individual versions.
ii timescaledb-2-postgresql-14 2.10.1~ubuntu22.04 amd64 An open-source time-series database based on PostgreSQL, as an extension.
ii timescaledb-toolkit-postgresql-14 1:1.15.0~ubuntu22.04 amd64 Library of analytical hyperfunctions, time-series pipelining, and other SQL utilities, compatible with TimescaleDB and PostgreSQL
ii timescaledb-tools 0.14.3~ubuntu22.04 amd64 A suite of tools that can be used with TimescaleDB.
Could you take a look, @jnidzwetzki ?
Solved it by deleting and recreating the CAgg policy.
It seems that the problem happens when I recreate the materialized views during the DB migration process.
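For anyone hitting the same issue, the delete-and-recreate step plus a manual backfill might look like this (a sketch; the policy intervals are copied from earlier in this thread, and the refresh window is an assumption you should adjust to cover the gap):

```sql
-- Drop and recreate the refresh policy lost during the migration.
SELECT remove_continuous_aggregate_policy('farm_tscondition_10m');
SELECT add_continuous_aggregate_policy('farm_tscondition_10m',
    start_offset => INTERVAL '1 h',
    end_offset => INTERVAL '10 m',
    schedule_interval => INTERVAL '10 m');

-- Manually materialize any range the policy missed while it was absent,
-- before the retention policy drops the raw data.
CALL refresh_continuous_aggregate('farm_tscondition_10m',
                                  now() - INTERVAL '1 day', now());
```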
What type of bug is this?
Data corruption
What subsystems and features are affected?
Continuous aggregate
What happened?
I have a website collecting sensor data, where I define 2 tables:
And hypertables:
There are 2 issues:
Please tell me what info you need to debug further. I can also send the whole data if needed.
TimescaleDB version affected
2.6.1
PostgreSQL version used
12
What operating system did you use?
Ubuntu 20.04 x86
What installation method did you use?
Deb/Apt
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
No response
How can we reproduce the bug?
No response