Closed · lnagel closed this issue 7 months ago
Considering you are running this on a router, are you using USB-attached storage? It's likely that the database has grown so large that the increased I/O latency is too much for the application to handle.
Considering you are running this on a router, are you using USB-attached storage? It's likely that the database has grown so large that the increased I/O latency is too much for the application to handle.
It was running on USB-attached storage, but it's a fast drive (DataTraveler Max USB 3.2 Gen 2), and it wasn't even utilizing the disk that much.
I moved the container with its data folder to an 8th-gen Intel NUC running the latest Docker, with the Docker filesystem on a dedicated NVMe drive. Unfortunately, the issue persists.
Is there any advice on how to reset the app's accumulated history data without losing all 89 configured monitors? I am mostly using HTTP, Ping, and MQTT checks.
That's very strange.
If you have worked with databases before, you can stop the application and open the database file kuma.db in any SQLite-compatible application (DBeaver, Beekeeper, etc.). You can then manually delete old rows in the heartbeat table.
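For reference, a time-bounded cleanup could look roughly like this. This is only a sketch, not an official procedure: it assumes the default kuma.db file in the data volume and that the heartbeat table has a time column; stop the container first and adjust the cutoff to taste.

sqlite3 /path/to/uptime-kuma-data/kuma.db
sqlite> vacuum into 'kuma-backup.db';   -- optional safety copy first (SQLite 3.27+)
sqlite> delete from heartbeat where time < datetime('now', '-30 days');
sqlite> vacuum;                         -- reclaim the freed space on disk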
Hello, I seem to have a similar problem.
This morning, without having received any particular notification, I saw that all the monitors had a problem during the night. In the Docker logs, the line "Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?" is present for all monitors.
I'm on a VPS running AlmaLinux 8.9 (Midnight Oncilla) with a 4.18.0-513.5.1.el8_9.x86_64 kernel, Uptime Kuma version 1.23.8, and Docker version 24.0.7, build afdd53b.
Regards,
That's very strange.
If you have worked with databases before, you can stop the application and open the database file kuma.db in any SQLite-compatible application (DBeaver, Beekeeper, etc.). You can then manually delete old rows in the heartbeat table.
Thanks. I did the following, which reduced the size of the database from 297M down to 252K.
root@nuc:/var/lib/docker/volumes/uptime-kuma_data/_data# sqlite3 kuma.db
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
sqlite> delete from heartbeat;
sqlite> vacuum;
sqlite>
At least Uptime Kuma will start up now, and there are no errors in the logs so far. Let's see...
Changed "Keep monitor history data" from 180 days to 7 days. Still, I installed it about 3 weeks ago at most.
I see a very similar error, running in docker on my Synology NAS:
2023-12-30T12:49:44+01:00 [MONITOR] INFO: Try to restart the monitor
Trace: KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
at Client_SQLite3.acquireConnection (/app/node_modules/knex/lib/client.js:312:26)
at async Runner.ensureConnection (/app/node_modules/knex/lib/execution/runner.js:287:28)
at async Runner.run (/app/node_modules/knex/lib/execution/runner.js:30:19)
at async RedBeanNode.normalizeRaw (/app/node_modules/redbean-node/dist/redbean-node.js:572:22)
at async RedBeanNode.getRow (/app/node_modules/redbean-node/dist/redbean-node.js:558:22)
at async RedBeanNode.getCell (/app/node_modules/redbean-node/dist/redbean-node.js:593:19)
at async Settings.get (/app/server/settings.js:54:21)
at async exports.setting (/app/server/util-server.js:610:12)
at async /app/server/server.js:199:13 {
sql: 'SELECT `value` FROM setting WHERE `key` = ? limit ?',
bindings: [ 'trustProxy', 1 ]
}
at process.unexpectedErrorHandler (/app/server/server.js:1899:13)
at process.emit (node:events:517:28)
at emit (node:internal/process/promises:149:20)
at processPromiseRejections (node:internal/process/promises:283:27)
at process.processTicksAndRejections (node:internal/process/task_queues:96:32)
If you keep encountering errors, please report to https://github.com/louislam/uptime-kuma/issues
Piling on: I have also been seeing this issue for over a year when I am moving around the app a lot (looking across monitors, editing the dashboards), but it usually self-resolves after a couple of minutes of letting things catch up. Today it has not resolved, which led me to this issue.
Edit 1: Adding current DB size stats (via the backups) before clearing out the heartbeat table as suggested by @lnagel:
507M Feb 11 21:55 kuma.db
60K Oct 17 2021 kuma.db.bak0
6.6M Oct 26 2021 kuma.db.bak20211026165215
10M Oct 31 2021 kuma.db.bak20211031220900
115M Jan 3 2022 kuma.db.bak20220103141319
211M May 27 2022 kuma.db.bak20220527215144
211M Jul 9 2022 kuma.db.bak20220709110215
211M Sep 11 2022 kuma.db.bak20220911174428
410M Dec 31 2022 kuma.db.bak20221231132150
452M Feb 18 2023 kuma.db.bak20230218182325
484M Mar 22 2023 kuma.db.bak20230322220702
Edit 2:
The truncate was taking longer than expected, so I killed it and ran select count(*) from heartbeat; there were still 3.23M records in the table. I have kicked it off again, but that does seem awfully high for 12 monitors with checks every 2-5 minutes.
Edit 3:
Yep, the truncate worked and everything about the app is snappy again. Post-vacuum, the table is also much more reasonable:
568K Feb 11 22:33 kuma.db
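As an aside, before deleting anything it can be useful to see which monitors those millions of rows actually belong to. A rough sketch from the same sqlite3 shell, assuming the heartbeat table's monitor_id column (verify against your own schema):

sqlite> select monitor_id, count(*) as cnt from heartbeat group by monitor_id order by cnt desc;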
We implemented incremental_vacuum in 1.23, which should have mitigated this issue. Can you check whether you are running the latest version and, if so, whether there are items in the logs indicating that the incremental_vacuum task has failed?
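For anyone who wants to check this directly, the following are standard SQLite pragmas (not Uptime Kuma commands) that can be run from the sqlite3 shell against a stopped instance; a sketch only:

sqlite> pragma auto_vacuum;         -- should report 2 (INCREMENTAL) if the 1.23 change took effect
sqlite> pragma freelist_count;      -- free pages waiting to be reclaimed
sqlite> pragma incremental_vacuum;  -- reclaims those pages without a full rebuild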
@chakflying Just confirmed: 1.23.11. I have Docker image update monitors, so I would have updated within days to a week of the release.
I think the problem is less about vacuum and more that the heartbeat table was huge, at >3 million records for a small number of monitors.
What is your retention time in the settings set to?
@CommanderStorm It took me a few minutes to find that, as I'd never changed it. It was set to the default (I assume) of 180 days. I set it to 14 days just now to hopefully avoid this issue again.
Okay, I am assuming your issue is the same, @lnagel.
We know that lowering retention is not a good long-term solution, but for 1.23.X that is everything we can offer as a quick "remedy".
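To put rough numbers on why retention matters so much here (assuming the default 60-second check interval, which may not match every monitor): 89 monitors over 180 days comes to about 89 × 1440 × 180 ≈ 23 million heartbeat rows, while a 14-day retention keeps that closer to 1.8 million.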
A lot of performance improvements (using aggregated vs. non-aggregated tables to store heartbeats, enabling users to choose MariaDB as a DB backend, pagination of important events) have been made in v2.0 (our next release), resolving this problem area.
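To illustrate what aggregated tables buy here, the following is a purely hypothetical schema for illustration, not the actual v2.0 layout: charts and uptime figures can be served from one summary row per monitor per day instead of scanning every raw heartbeat.

-- hypothetical example only
create table stat_daily_example (
  monitor_id integer not null,
  day        text    not null,   -- 'YYYY-MM-DD'
  up_count   integer not null default 0,
  down_count integer not null default 0,
  avg_ping   real,
  primary key (monitor_id, day)
);

-- a 30-day chart then reads 30 rows instead of ~43,000 heartbeats at a 60-second interval
select day, up_count, down_count, avg_ping from stat_daily_example where monitor_id = 1 order by day desc limit 30;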
=> I'm going to close this issue
You can subscribe to our releases and get notified when a new release (such as v2.0-beta.0) gets made.
See https://github.com/louislam/uptime-kuma/pull/4171 for the bugs that need addressing before that can happen.
Meanwhile, the underlying issue is that SQLite cannot read the data fast enough to keep up.
Description
Startup crash with Trace: KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
Reproduction steps
Start container with Uptime Kuma, log in and wait for data to load on the dashboard. Check logs for errors.
Expected behavior
Dashboard would load, monitor checks would be run, log not full of errors.
Actual Behavior
Dashboards do not load any data, monitor checks are not being run, log is full of errors.
Uptime-Kuma Version
1.23.7
Operating System and Arch
MikroTik RouterOS 7.11.2
Browser
Firefox latest
Docker Version
No response
NodeJS Version
18.18.2
Relevant log output