timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
17.96k stars 883 forks source link

[Bug]: <Title>autovacuum worker took too long to start; canceled #7485

Open yangsr12 opened 1 week ago

yangsr12 commented 1 week ago

What type of bug is this?

Locking issue, Performance issue, Unexpected error

What subsystems and features are affected?

Background worker

What happened?

行 3592: 2024-11-16 07:49:39.740 HKT [17948] WARNING:  autovacuum worker took too long to start; canceled
行 3597: 2024-11-16 07:50:09.744 HKT [17948] WARNING:  autovacuum worker took too long to start; canceled
行 3599: 2024-11-16 07:50:39.857 HKT [17948] WARNING:  autovacuum worker took too long to start; canceled
行 3703: 2024-11-16 07:51:09.878 HKT [17948] WARNING:  autovacuum worker took too long to start; canceled
行 3704: 2024-11-16 07:51:39.903 HKT [17948] WARNING:  autovacuum worker took too long to start; canceled

TimescaleDB version affected

2.11.1

PostgreSQL version used

15.2

What operating system did you use?

windows 10

What installation method did you use?

Source

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

2024-11-16 07:47:10.716 HKT [12176] LOG:  connection received: host=127.0.0.1 port=64055
2024-11-16 07:47:10.718 HKT [12176] LOG:  connection authorized: user=dbuser database=yangdb
2024-11-16 07:47:11.909 HKT [43120] LOG:  disconnection: session time: 0:29:51.153 user=dbuser database=yangdb host=127.0.0.1 port=54352
2024-11-16 07:47:18.049 HKT [33820] LOG:  connection received: host=127.0.0.1 port=64092
2024-11-16 07:47:18.052 HKT [33820] LOG:  connection authorized: user=dbuser database=yangdb
2024-11-16 07:48:25.254 HKT [11552] LOG:  disconnection: session time: 0:29:33.416 user=dbuser database=yangdb host=127.0.0.1 port=54863
2024-11-16 07:48:25.303 HKT [32680] LOG:  connection received: host=127.0.0.1 port=64500
2024-11-16 07:48:25.305 HKT [32680] LOG:  connection authorized: user=dbuser database=yangdb
2024-11-16 07:48:27.549 HKT [38952] LOG:  disconnection: session time: 0:29:31.014 user=dbuser database=yangdb host=127.0.0.1 port=54888
2024-11-16 07:49:12.159 HKT [34224] LOG:  disconnection: session time: 0:29:53.108 user=dbuser database=yangdb host=127.0.0.1 port=55007
2024-11-16 07:49:39.740 HKT [17948] WARNING:  autovacuum worker took too long to start; canceled
2024-11-16 07:49:55.081 HKT [38632] LOG:  duration: 60003.969 ms  execute <unnamed>: xxx
2024-11-16 07:49:55.081 HKT [38632] DETAIL:  parameters: xxx
2024-11-16 07:50:09.744 HKT [17948] WARNING:  autovacuum worker took too long to start; canceled
2024-11-16 07:50:30.538 HKT [35924] LOG:  disconnection: session time: 0:29:16.448 user=dbuser database=yangdb host=127.0.0.1 port=55613
2024-11-16 07:50:39.857 HKT [17948] WARNING:  autovacuum worker took too long to start; canceled
2024-11-16 07:50:43.979 HKT [15392] LOG:  disconnection: session time: 0:29:15.354 user=dbuser database=yangdb host=127.0.0.1 port=55691
2024-11-16 07:50:53.402 HKT [43384] LOG:  duration: 120019.136 ms  execute S_2: INSERT INTO data into timescaled table xxx
2024-11-16 07:50:53.402 HKT [43384] DETAIL:  parameters: xxx
2024-11-16 07:50:55.107 HKT [4288] LOG:  duration: 120030.541 ms  bind <unnamed>: xxx
2024-11-16 07:50:55.107 HKT [4288] DETAIL:  parameters: xxx
2024-11-16 07:50:56.227 HKT [39728] LOG:  disconnection: session time: 0:29:41.247 user=dbuser database=yangdb host=127.0.0.1 port=55618
2024-11-16 07:51:03.313 HKT [42372] LOG:  duration: 120028.196 ms  bind <unnamed>: xxx
2024-11-16 07:51:03.313 HKT [42372] DETAIL:  parameters: xxx
2024-11-16 07:51:06.129 HKT [33124] LOG:  duration: 120010.540 ms  execute <unnamed>: insert into xxx
2024-11-16 07:51:06.129 HKT [33124] DETAIL:  parameters: xxx
2024-11-16 07:51:09.878 HKT [17948] WARNING:  autovacuum worker took too long to start; canceled
2024-11-16 07:51:39.903 HKT [17948] WARNING:  autovacuum worker took too long to start; canceled
2024-11-16 07:51:48.963 HKT [19020] LOG:  disconnection: session time: 0:29:42.753 user=dbuser database=yangdb host=127.0.0.1 port=55886
2024-11-16 07:51:53.327 HKT [23584] LOG:  duration: 176946.906 ms  bind <unnamed>: select xxxx
2024-11-16 07:51:53.327 HKT [23584] DETAIL:  parameters: xxx
2024-11-16 07:51:53.332 HKT [16224] LOG:  duration: 179782.672 ms  execute S_156: INSERT INTO data into timescaled table xxx
2024-11-16 07:51:53.332 HKT [16224] DETAIL:  parameters: xxx
2024-11-16 07:51:54.032 HKT [37276] LOG:  duration: 180766.301 ms  execute S_25: INSERT INTO data into timescaled table xxx
2024-11-16 07:51:54.032 HKT [8308] DETAIL:  parameters: xxx
2024-11-16 07:51:54.036 HKT [4288] LOG:  disconnection: session time: 0:30:44.031 user=dbuser database=yangdb host=127.0.0.1 port=55575
2024-11-16 07:51:54.039 HKT [38632] LOG:  could not receive data from client: An established connection was aborted by the software in your host machine.
2024-11-16 07:51:54.039 HKT [38632] LOG:  disconnection: session time: 0:07:01.574 user=dbuser database=yangdb host=127.0.0.1 port=63324
2024-11-16 07:51:54.072 HKT [33124] LOG:  could not receive data from client: An established connection was aborted by the software in your host machine.
2024-11-16 07:51:54.074 HKT [33820] LOG:  duration: 177694.506 ms  bind <unnamed>: SELECT  xxx
2024-11-16 07:51:54.074 HKT [33820] DETAIL:  parameters: xxx
2024-11-16 07:51:54.074 HKT [33124] LOG:  disconnection: session time: 0:29:08.009 user=dbuser database=yangdb host=127.0.0.1 port=56122
2024-11-16 07:51:54.089 HKT [35480] LOG:  duration: 117831.745 ms  bind <unnamed>: SELECT 
2024-11-16 07:51:54.089 HKT [35480] DETAIL:  parameters: xx
2024-11-16 07:51:54.090 HKT [42372] LOG:  could not receive data from client: An established connection was aborted by the software in your host machine.
2024-11-16 07:51:54.091 HKT [42372] LOG:  disconnection: session time: 0:30:37.082 user=dbuser database=yangdb host=127.0.0.1 port=55631
2024-11-16 07:51:54.099 HKT [23584] LOG:  could not send data to client: An established connection was aborted by the software in your host machine.

2024-11-16 07:51:54.099 HKT [23584] FATAL:  connection to client lost
2024-11-16 07:51:54.099 HKT [23584] LOG:  disconnection: session time: 0:09:05.001 user=dbuser database=yangdb host=127.0.0.1 port=62648
2024-11-16 07:51:54.102 HKT [37276] LOG:  could not receive data from client: An established connection was aborted by the software in your host machine.

2024-11-16 07:51:54.103 HKT [37276] LOG:  disconnection: session time: 0:22:51.139 user=dbuser database=yangdb host=127.0.0.1 port=58142
2024-11-16 07:51:54.103 HKT [33820] LOG:  could not receive data from client: An established connection was aborted by the software in your host machine.

2024-11-16 07:51:54.103 HKT [33820] LOG:  disconnection: session time: 0:04:36.086 user=dbuser database=yangdb host=127.0.0.1 port=64092
2024-11-16 07:51:54.103 HKT [35480] LOG:  could not receive data from client: An established connection was aborted by the software in your host machine.

2024-11-16 07:51:54.104 HKT [35480] LOG:  disconnection: session time: 0:30:37.575 user=dbuser database=yangdb host=127.0.0.1 port=55627
2024-11-16 07:51:54.107 HKT [37696] LOG:  duration: 167450.492 ms  bind <unnamed>: SELECT xxx
2024-11-16 07:51:54.107 HKT [37696] DETAIL:  parameters: xxx
2024-11-16 07:51:54.111 HKT [37696] LOG:  could not receive data from client: An established connection was aborted by the software in your host machine.

2024-11-16 07:51:54.113 HKT [37696] LOG:  disconnection: session time: 0:25:40.782 user=dbuser database=yangdb host=127.0.0.1 port=57236
2024-11-16 07:51:54.197 HKT [8308] LOG:  could not receive data from client: An established connection was aborted by the software in your host machine.

2024-11-16 07:51:54.198 HKT [8308] LOG:  disconnection: session time: 0:21:01.478 user=dbuser database=yangdb host=127.0.0.1 port=58750
2024-11-16 07:51:54.709 HKT [35036] LOG:  connection received: host=127.0.0.1 port=49213
2024-11-16 07:51:54.790 HKT [35036] LOG:  connection authorized: user=dbuser database=yangdb

How can we reproduce the bug?

I'm not sure as well. There have been several instances where autovacuum failed to start, despite the CPU, memory, hard drive, and other resources monitored by Windows being normal at that time. It seems similar to the issue described below: https://github.com/timescale/timescaledb/issues/6730

Is this a known bug in Timescaledb?
erimatnor commented 5 days ago

This is quite an old version of TimescaleDB. I suggest updating to a newer version to see if the problem persists.

yangsr12 commented 5 days ago

This is quite an old version of TimescaleDB. I suggest updating to a newer version to see if the problem persists.

The version updates of TimescaleDB are relatively frequent. Version 2.11.1 was released in June of last year, and now the latest version is already 2.17.2. Rapid version updates indicate that the software is active, which is a good thing. However, for our application, upgrading the version may involve some costs, unless we can confirm this issue caused by TimescaleDB.

Our data volume is relatively large, with approximately 5 million records per hour. The link https://github.com/timescale/timescaledb/issues/6730 mentions that a table with 1 million records may trigger a failure to start autovacuum. From the logs above, it appears that our database not only failed to start autovacuum at that time, but also experienced many slow SQL queries, resulting in the inability to establish new connections to the database.