Open pgloader opened 4 months ago
@pgloader can u please copy/paste the result of the following query?
SELECT * FROM timescaledb_information.job_history ORDER BY start_time;
Hi, I think I could have the same issues with background jobs. In my case, I have this issue with continuous aggregations:
2024-07-31 14:16:48 UTC [51]: [15-1] 66aa4740.33 0 LOG: background worker "Refresh Continuous Aggregate Policy [1613]" (PID 222) was terminated by signal 11: Segmentation fault
2024-07-31 14:16:48 UTC [51]: [16-1] 66aa4740.33 0 DETAIL: Failed process was running: CALL _timescaledb_functions.policy_refresh_continuous_aggregate()
2024-07-31 14:16:48 UTC [51]: [17-1] 66aa4740.33 0 LOG: terminating any other active server processes
After the segfault, the whole Postgres restarts and enters recovery mode for a while. I'm using Timescale 2.15.3
This is the result of the query you asked.
id | job_id | succeeded | proc_schema | proc_name | pid | start_time | finish_time | config | sqlerrcode | err_message |
---|---|---|---|---|---|---|---|---|---|---|
1 | 1001 | false | _timescaledb_functions | policy_compression | 2672843 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:00.356 +0200 | {"hypertable_id": 2, "compress_after": "01:00:00"} | job crash detected, see server logs | |
2 | 1004 | false | _timescaledb_functions | policy_compression | 2672843 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:00.362 +0200 | {"hypertable_id": 5, "compress_after": "01:00:00"} | job crash detected, see server logs | |
3 | 1005 | false | _timescaledb_functions | policy_compression | 2672843 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:00.362 +0200 | {"hypertable_id": 1, "compress_after": "01:00:00"} | job crash detected, see server logs | |
4 | 1010 | false | _timescaledb_functions | policy_retention | 2672843 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:00.362 +0200 | {"drop_after": "30 days", "hypertable_id": 5} | job crash detected, see server logs | |
5 | 1025 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2672843 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:00.362 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
6 | 1028 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2672843 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:00.362 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
7 | 1029 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2672843 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:00.362 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
34 | 1070 | false | _timescaledb_functions | policy_retention | 2673025 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:32.354 +0200 | {"drop_after": "30 days", "hypertable_id": 5} | job crash detected, see server logs | |
35 | 1085 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2673025 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:32.355 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
36 | 1088 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2673025 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:32.355 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
37 | 1089 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2673025 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 09:09:32.355 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
38 | 1004 | false | _timescaledb_functions | policy_compression | 2698766 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:20.310 +0200 | {"hypertable_id": 5, "compress_after": "01:00:00"} | job crash detected, see server logs | |
39 | 1005 | false | _timescaledb_functions | policy_compression | 2698766 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:20.313 +0200 | {"hypertable_id": 1, "compress_after": "01:00:00"} | job crash detected, see server logs | |
40 | 1118 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698766 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:20.314 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
41 | 1121 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698766 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:20.314 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
42 | 1122 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698766 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:20.314 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
43 | 1123 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698766 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:20.314 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
71 | 1151 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698852 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:31.238 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
72 | 1154 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698852 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:31.239 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
73 | 1155 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698852 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:31.239 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
74 | 1156 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698852 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:31.239 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
104 | 1184 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698927 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:40.968 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
105 | 1187 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698927 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:40.969 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
106 | 1188 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698927 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:40.969 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
107 | 1189 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2698927 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 11:17:40.969 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
108 | 1004 | false | _timescaledb_functions | policy_compression | 2771978 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 19:47:40.848 +0200 | {"hypertable_id": 5, "compress_after": "01:00:00"} | job crash detected, see server logs | |
109 | 1217 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2771978 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 19:47:40.855 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
110 | 1220 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2771978 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 19:47:40.855 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
111 | 1221 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2771978 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 19:47:40.855 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
112 | 1222 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2771978 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 19:47:40.855 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
141 | 1250 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2772125 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 19:48:16.886 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
142 | 1253 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2772125 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 19:48:16.886 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
143 | 1254 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2772125 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 19:48:16.886 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
144 | 1255 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2772125 | 2000-01-01 01:00:00.000 +0100 | 2024-07-24 19:48:16.886 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
145 | 1253 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2957163 | 2000-01-01 01:00:00.000 +0100 | 2024-07-25 17:56:17.357 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
178 | 1283 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2957254 | 2000-01-01 01:00:00.000 +0100 | 2024-07-25 17:56:28.657 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
179 | 1286 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2957254 | 2000-01-01 01:00:00.000 +0100 | 2024-07-25 17:56:28.658 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
180 | 1287 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2957254 | 2000-01-01 01:00:00.000 +0100 | 2024-07-25 17:56:28.658 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
181 | 1288 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 2957254 | 2000-01-01 01:00:00.000 +0100 | 2024-07-25 17:56:28.659 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
182 | 1286 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3275070 | 2000-01-01 01:00:00.000 +0100 | 2024-07-27 17:11:53.277 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
183 | 1287 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3427181 | 2000-01-01 01:00:00.000 +0100 | 2024-07-28 15:41:31.390 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
216 | 1316 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3427265 | 2000-01-01 01:00:00.000 +0100 | 2024-07-28 15:41:41.907 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
217 | 1319 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3427265 | 2000-01-01 01:00:00.000 +0100 | 2024-07-28 15:41:41.907 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
218 | 1320 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3427265 | 2000-01-01 01:00:00.000 +0100 | 2024-07-28 15:41:41.907 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
219 | 1321 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3427265 | 2000-01-01 01:00:00.000 +0100 | 2024-07-28 15:41:41.907 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
249 | 1349 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3427339 | 2000-01-01 01:00:00.000 +0100 | 2024-07-28 15:41:51.902 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
250 | 1352 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3427339 | 2000-01-01 01:00:00.000 +0100 | 2024-07-28 15:41:51.902 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
251 | 1353 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3427339 | 2000-01-01 01:00:00.000 +0100 | 2024-07-28 15:41:51.902 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
252 | 1354 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3427339 | 2000-01-01 01:00:00.000 +0100 | 2024-07-28 15:41:51.902 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
253 | 1349 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3571752 | 2000-01-01 01:00:00.000 +0100 | 2024-07-29 13:04:27.233 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
254 | 1352 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3746649 | 2000-01-01 01:00:00.000 +0100 | 2024-07-30 14:57:14.466 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
287 | 1382 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3746753 | 2000-01-01 01:00:00.000 +0100 | 2024-07-30 14:57:30.255 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
288 | 1385 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3746753 | 2000-01-01 01:00:00.000 +0100 | 2024-07-30 14:57:30.255 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
289 | 1386 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3746753 | 2000-01-01 01:00:00.000 +0100 | 2024-07-30 14:57:30.255 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
290 | 1382 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3889928 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 12:08:50.999 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
291 | 1386 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3889928 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 12:08:51.001 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
292 | 1415 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3894415 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 12:46:04.729 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
293 | 1418 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3894415 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 12:46:04.730 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
294 | 1419 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3894415 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 12:46:04.730 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
295 | 1420 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3894415 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 12:46:04.730 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
296 | 1448 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3907195 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:34:21.743 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
297 | 1451 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3907195 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:34:21.743 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
298 | 1452 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3907195 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:34:21.744 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
299 | 1453 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3907195 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:34:21.744 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
329 | 1481 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3907971 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:40:19.588 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
330 | 1484 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3907971 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:40:19.589 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
331 | 1485 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3907971 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:40:19.589 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
332 | 1486 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3907971 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:40:19.589 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
333 | 1514 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3909582 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:53:52.791 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
334 | 1517 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3909582 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:53:52.791 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
335 | 1518 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3909582 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:53:52.791 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
336 | 1519 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3909582 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 14:53:52.791 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
337 | 1547 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3912453 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 15:18:00.563 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
338 | 1550 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3912453 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 15:18:00.563 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
339 | 1551 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3912453 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 15:18:00.564 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
340 | 1552 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3912453 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 15:18:00.564 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
370 | 1580 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3912538 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 15:18:10.703 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
371 | 1583 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3912538 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 15:18:10.704 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
372 | 1584 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3912538 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 15:18:10.704 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
373 | 1585 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 3912538 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 15:18:10.704 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs | |
374 | 1613 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 260 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 16:16:49.179 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 18} | job crash detected, see server logs | |
375 | 1616 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 260 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 16:16:49.182 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 16} | job crash detected, see server logs | |
376 | 1617 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 260 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 16:16:49.182 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 24} | job crash detected, see server logs | |
377 | 1618 | false | _timescaledb_functions | policy_refresh_continuous_aggregate | 260 | 2000-01-01 01:00:00.000 +0100 | 2024-07-31 16:16:49.182 +0200 | {"end_offset": "00:00:00", "start_offset": "28 days", "mat_hypertable_id": 28} | job crash detected, see server logs |
@cosimomeli looks like your background job for cagg refresh is leading to segfault then the job_errors output is correct. Would be great if you open another issue related to this segmentation for for further investigation.
SELECT * FROM timescaledb_information.job_history ORDER BY start_time;
875 | 1005 | t | _timescaledb_functions | policy_compression | 1481534 | 2024-10-20 09:15:00.004445-04 | 2024-10-20 09:15:00.016282-04 | {"hypertable_id": 1, "compress_after": "14 days"} | | 876 | 1000 | t | _timescaledb_functions | policy_refresh_continuous_aggregate | 1504011 | 2024-10-20 09:30:00.003659-04 | 2024-10-20 09:33:19.511167-04 | {"end_offset": "00:01:00", "start_offset": "01:01:00", "mat_hypertable_id": 2} | | 877 | 1016 | t | _timescaledb_functions | policy_refresh_continuous_aggregate | 1504012 | 2024-10-20 09:30:00.004353-04 | 2024-10-20 09:32:59.072322-04 | {"end_offset": "00:01:00", "start_offset": "01:01:00", "mat_hypertable_id": 14} | | 878 | 1008 | t | _timescaledb_functions | policy_refresh_continuous_aggregate | 1504013 | 2024-10-20 09:30:00.00464-04 | 2024-10-20 09:30:15.16198-04 | {"end_offset": "00:01:00", "start_offset": "01:01:00", "mat_hypertable_id": 8} | | 879 | 1021 | t | _timescaledb_functions | policy_compression | 1534815 | 2024-10-20 09:45:00.002833-04 | 2024-10-20 09:45:00.013852-04 | {"hypertable_id": 13, "compress_after": "14 days"} | | 880 | 2 | f | _timescaledb_functions | policy_job_error_retention | 1544833 | 2024-10-20 09:52:29.102847-04 | 2024-10-20 09:52:29.109349-04 | {"drop_after": "1 month"} | 42883 | function _timescaledb_functions.policy_job_error_retention(integer, jsonb) does not exist 881 | 1008 | t | _timescaledb_functions | policy_refresh_continuous_aggregate | 1558331 | 2024-10-20 10:00:00.000606-04 | 2024-10-20 10:00:16.541826-04 | {"end_offset": "00:01:00", "start_offset": "01:01:00", "mat_hypertable_id": 8} | | 882 | 1016 | t | _timescaledb_functions | policy_refresh_continuous_aggregate | 1558332 | 2024-10-20 10:00:00.001381-04 | 2024-10-20 10:03:01.800141-04 | {"end_offset": "00:01:00", "start_offset": "01:01:00", "mat_hypertable_id": 14} | | 883 | 1000 | t | _timescaledb_functions | policy_refresh_continuous_aggregate | 1558333 | 2024-10-20 10:00:00.001708-04 | 2024-10-20 10:03:21.682105-04 | {"end_offset": "00:01:00", "start_offset": "01:01:00", "mat_hypertable_id": 2} | | 884 | 1008 | t | _timescaledb_functions | policy_refresh_continuous_aggregate | 1612695 | 2024-10-20 10:30:00.004533-04 | 2024-10-20 10:30:16.592725-04 | {"end_offset": "00:01:00", "start_offset": "01:01:00", "mat_hypertable_id": 8} | | 885 | 1016 | t | _timescaledb_functions | policy_refresh_continuous_aggregate | 1612696 | 2024-10-20 10:30:00.005279-04 | 2024-10-20 10:33:04.315277-04 | {"end_offset": "00:01:00", "start_offset": "01:01:00", "mat_hypertable_id": 14} | | 886 | 1000 | t | _timescaledb_functions | policy_refresh_continuous_aggregate | 1612697 | 2024-10-20 10:30:00.005631-04 | 2024-10-20 10:33:31.534959-04 | {"end_offset": "00:01:00", "start_offset": "01:01:00", "mat_hypertable_id": 2} | | (886 rows)
These are the most recent
@pgloader can u please copy/paste the result of the following query?
SELECT * FROM timescaledb_information.job_history ORDER BY start_time;
Please see the attached outouts crash.txt
Could you take a look at your logs for these failures?
825 | 1016 | f | _timescaledb_functions | policy_refresh_continuous_aggregate | 3882667 | 2024-10-20 01:30:00.002859-04 | 2024-10-20 01:30:00.002877-04 | | | job crash detected, see server logs
826 | 1000 | f | _timescaledb_functions | policy_refresh_continuous_aggregate | 3882667 | 2024-10-20 01:30:00.003178-04 | 2024-10-20 01:30:00.003195-04 | | | job crash detected, see server logs
Some time around 2024-10-20 01:30:00.002877-04
should give us some log lines about what happened here.
Or do the logs not contain anything this time either?
Thanks.
2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_300" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_60" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_10" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.472 EDT @ LOG: deleted 0 row(s) from materialization table "_timescaledb_internal._materialized_hypertable_8"
2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_300" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_60" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_10" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.472 EDT @ LOG: deleted 0 row(s) from materialization table "_timescaledb_internal._materialized_hypertable_8"
Is that possible you to cut your logs from 2024-10-20 01:00:00.000 EDT
to 2024-10-20 02:00:00.000 EDT
and attach here? But please make sure there's no sensible information on it.
Also recently I've made some refactoring on the code that capture and record logs executions and exceptions and would be nice if you can try it out by updating the extension to the 2.17.1.
2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_300" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_60" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_10" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.472 EDT @ LOG: deleted 0 row(s) from materialization table "_timescaledb_internal._materialized_hypertable_8"
Is that possible you to cut your logs from
2024-10-20 01:00:00.000 EDT
to2024-10-20 02:00:00.000 EDT
and attach here? But please make sure there's no sensible information on it.Also recently I've made some refactoring on the code that capture and record logs executions and exceptions and would be nice if you can try it out by updating the extension to the 2.17.1.
Please see the attached 0102.log
2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_300" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_60" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_10" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ] 2024-10-20 01:30:00.472 EDT @ LOG: deleted 0 row(s) from materialization table "_timescaledb_internal._materialized_hypertable_8"
Is that possible you to cut your logs from
2024-10-20 01:00:00.000 EDT
to2024-10-20 02:00:00.000 EDT
and attach here? But please make sure there's no sensible information on it. Also recently I've made some refactoring on the code that capture and record logs executions and exceptions and would be nice if you can try it out by updating the extension to the 2.17.1.Please see the attached 0102.log
A quick grep into your logs showed the following:
/tmp
➜ grep -i 'Continuous Aggregate' 0102.log
2024-10-20 01:00:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_300" in window [ 2024-10-20 00:00:00-04, 2024-10-20 00:30:00-04 ]
2024-10-20 01:00:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_60" in window [ 2024-10-20 00:00:00-04, 2024-10-20 00:30:00-04 ]
2024-10-20 01:00:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_10" in window [ 2024-10-20 00:00:00-04, 2024-10-20 00:30:00-04 ]
2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_300" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ]
2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_60" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ]
2024-10-20 01:30:00.010 EDT @ LOG: continuous aggregate refresh (individual invalidation) on "cag_30m_metric_data_10" in window [ 2024-10-20 00:30:00-04, 2024-10-20 01:00:00-04 ]
2024-10-20 01:30:10.012 EDT @ FATAL: terminating background worker "Refresh Continuous Aggregate Policy [1000]" due to administrator command
2024-10-20 01:30:12.440 EDT @ LOG: background worker "Refresh Continuous Aggregate Policy [1000]" (PID 621682) exited with exit code 1
2024-10-20 01:30:12.445 EDT @ FATAL: terminating background worker "Refresh Continuous Aggregate Policy [1016]" due to administrator command
2024-10-20 01:30:15.075 EDT @ LOG: background worker "Refresh Continuous Aggregate Policy [1016]" (PID 621681) exited with exit code 1
Looks like another process cancel the execution of the job so the error history is correct.
Most likely conflicted with the pg_dump backup. Should we manually re-run these aggregations?
Most likely conflicted with the pg_dump backup. Should we manually re-run these aggregations?
If you have successful executions after the failure then you're safe since the next execution will process all invalidation logs created even it it was before the window range executed by policy. The downside of it is that it will take more time to refresh because it have more buckets to aggregate.
What type of bug is this?
Crash
What subsystems and features are affected?
Background worker
What happened?
job crash detected, see server logs but there was no information in the PostrgeSQL statement log TimescaleDB: 2.15.2 PostgreSQL: 16.3 log_min_error_statement: log
Besides, the message in the job_history would disappear
it was 5 minutes ago =# select * from job_errors; job_id | proc_schema | proc_name | pid | start_time | finish_time | sqlerrcode | err_message --------+------------------------+-------------------------------------+---------+-------------------------------+-------------------------------+------------+------------------------------------- 1003 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 09:06:39.752447-04 | 2024-06-28 09:06:39.752542-04 | | job crash detected, see server logs 1002 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 10:46:50.699781-04 | 2024-06-28 10:46:50.699857-04 | | job crash detected, see server logs 1025 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 06:30:00.006238-04 | 2024-07-01 06:30:00.006365-04 | | job crash detected, see server logs 1023 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 07:00:00.00073-04 | 2024-07-01 07:00:00.000763-04 | | job crash detected, see server logs (4 rows)
Now
select * from job_errors;
job_id | proc_schema | proc_name | pid | start_time | finish_time | sqlerrcode | err_message --------+------------------------+-------------------------------------+---------+-------------------------------+-------------------------------+------------+------------------------------------- 1003 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 09:06:39.752447-04 | 2024-06-28 09:06:39.752542-04 | | job crash detected, see server logs 1002 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 10:46:50.699781-04 | 2024-06-28 10:46:50.699857-04 | | job crash detected, see server logs 1025 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 06:30:00.006238-04 | 2024-07-01 06:30:00.006365-04 | | job crash detected, see server logs 1023 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 09:00:00.008187-04 | 2024-07-01 09:00:00.008324-04 | | job crash detected, see server logs (4 rows)
The message 2024-07-01 06:30:00.006238-04 was gone
TimescaleDB version affected
2.15.2
PostgreSQL version used
16.3
What operating system did you use?
RHEL8.6
What installation method did you use?
Source
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
How can we reproduce the bug?