timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/

[Bug]: job crash detected, see server logs #7085

Open pgloader opened 5 days ago

pgloader commented 5 days ago

What type of bug is this?

Crash

What subsystems and features are affected?

Background worker

What happened?

The job history reports "job crash detected, see server logs", but there was no corresponding information in the PostgreSQL statement log.

TimescaleDB: 2.15.2
PostgreSQL: 16.3
log_min_error_statement: log

In addition, messages in job_history disappear.

Five minutes ago:

=# select * from job_errors;
 job_id |      proc_schema       |              proc_name              |   pid   |          start_time           |          finish_time          | sqlerrcode |             err_message
--------+------------------------+-------------------------------------+---------+-------------------------------+-------------------------------+------------+-------------------------------------
   1003 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 09:06:39.752447-04 | 2024-06-28 09:06:39.752542-04 |            | job crash detected, see server logs
   1002 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 10:46:50.699781-04 | 2024-06-28 10:46:50.699857-04 |            | job crash detected, see server logs
   1025 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 06:30:00.006238-04 | 2024-07-01 06:30:00.006365-04 |            | job crash detected, see server logs
   1023 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 07:00:00.00073-04  | 2024-07-01 07:00:00.000763-04 |            | job crash detected, see server logs
(4 rows)

Now

select * from job_errors;

 job_id |      proc_schema       |              proc_name              |   pid   |          start_time           |          finish_time          | sqlerrcode |             err_message
--------+------------------------+-------------------------------------+---------+-------------------------------+-------------------------------+------------+-------------------------------------
   1003 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 09:06:39.752447-04 | 2024-06-28 09:06:39.752542-04 |            | job crash detected, see server logs
   1002 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 10:46:50.699781-04 | 2024-06-28 10:46:50.699857-04 |            | job crash detected, see server logs
   1025 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 06:30:00.006238-04 | 2024-07-01 06:30:00.006365-04 |            | job crash detected, see server logs
   1023 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 09:00:00.008187-04 | 2024-07-01 09:00:00.008324-04 |            | job crash detected, see server logs
(4 rows)

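The entry for job 1023 with start_time 2024-07-01 07:00:00.00073-04 is gone; an entry for the same job with start_time 2024-07-01 09:00:00.008187-04 now appears in its place.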
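
To pin down exactly which entry changed between the two listings, the rows can be compared by (job_id, start_time). The following is a minimal Python sketch and not part of the original report; the row data is copied by hand from the two job_errors listings above, and the helper name diff_snapshots is hypothetical.

```python
# Rows copied from the two job_errors listings above as (job_id, start_time).
before = [
    (1003, "2024-06-28 09:06:39.752447-04"),
    (1002, "2024-06-28 10:46:50.699781-04"),
    (1025, "2024-07-01 06:30:00.006238-04"),
    (1023, "2024-07-01 07:00:00.00073-04"),
]
after = [
    (1003, "2024-06-28 09:06:39.752447-04"),
    (1002, "2024-06-28 10:46:50.699781-04"),
    (1025, "2024-07-01 06:30:00.006238-04"),
    (1023, "2024-07-01 09:00:00.008187-04"),
]


def diff_snapshots(before, after):
    """Return (missing, added) rows, keeping the order of each listing."""
    before_set = set(before)
    after_set = set(after)
    # Rows present in the earlier listing but absent from the later one.
    missing = [row for row in before if row not in after_set]
    # Rows present only in the later listing.
    added = [row for row in after if row not in before_set]
    return missing, added


missing, added = diff_snapshots(before, after)
print("missing:", missing)
print("added:", added)
```

Keying on (job_id, start_time) rather than job_id alone matters here, because the same job can fail more than once and each failure is a separate row.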

TimescaleDB version affected

2.15.2

PostgreSQL version used

16.3

What operating system did you use?

RHEL8.6

What installation method did you use?

Source

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

# select * from job_errors;
 job_id |      proc_schema       |              proc_name              |   pid   |          start_time           |          finish_time          | sqlerrcode |             err_message
--------+------------------------+-------------------------------------+---------+-------------------------------+-------------------------------+------------+-------------------------------------
   1003 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 09:06:39.752447-04 | 2024-06-28 09:06:39.752542-04 |            | job crash detected, see server logs
   1002 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 10:46:50.699781-04 | 2024-06-28 10:46:50.699857-04 |            | job crash detected, see server logs
   1025 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 06:30:00.006238-04 | 2024-07-01 06:30:00.006365-04 |            | job crash detected, see server logs
   1023 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 07:00:00.00073-04  | 2024-07-01 07:00:00.000763-04 |            | job crash detected, see server logs
(4 rows)

How can we reproduce the bug?

# select * from job_errors;
 job_id |      proc_schema       |              proc_name              |   pid   |          start_time           |          finish_time          | sqlerrcode |             err_message
--------+------------------------+-------------------------------------+---------+-------------------------------+-------------------------------+------------+-------------------------------------
   1003 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 09:06:39.752447-04 | 2024-06-28 09:06:39.752542-04 |            | job crash detected, see server logs
   1002 | _timescaledb_functions | policy_refresh_continuous_aggregate | 1116242 | 2024-06-28 10:46:50.699781-04 | 2024-06-28 10:46:50.699857-04 |            | job crash detected, see server logs
   1025 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 06:30:00.006238-04 | 2024-07-01 06:30:00.006365-04 |            | job crash detected, see server logs
   1023 | _timescaledb_functions | policy_refresh_continuous_aggregate | 2128427 | 2024-07-01 09:00:00.008187-04 | 2024-07-01 09:00:00.008324-04 |            | job crash detected, see server logs
(4 rows)