juanfont / headscale

An open source, self-hosted implementation of the Tailscale control server
BSD 3-Clause "New" or "Revised" License

[Bug] sqlite WAL never checkpointed, leading to constantly-increasing disk usage #2204

Open EtaoinWu opened 3 weeks ago

EtaoinWu commented 3 weeks ago

Current Behavior

I run a tailnet of about 20 nodes, and my sqlite database is around 500KB. After running it for several months, I noticed that it now occupies almost 1GB of disk space. Upon inspection, I found that db.sqlite-wal grows at a steady 500KB/hour, which works out to more than 4GB/year. Headscale never performs a checkpoint, and its wal_autocheckpoint is set to zero here. This leads to an ever-growing WAL file.
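To illustrate the mechanism outside of Headscale, here is a minimal Python sketch using only the standard `sqlite3` module. It assumes a throwaway database in a temporary directory; the table `kv`, the helper names `wal_size` and `demo_wal_growth`, and the write count are made up for the demonstration and have nothing to do with Headscale's schema or code. With `wal_autocheckpoint=0`, every committed write appends frames to the `-wal` file, and the file only shrinks once a checkpoint is requested explicitly.

```python
import os
import sqlite3
import tempfile


def wal_size(db_path):
    """Return the size in bytes of the WAL file next to db_path (0 if absent)."""
    wal_path = db_path + "-wal"
    return os.path.getsize(wal_path) if os.path.exists(wal_path) else 0


def demo_wal_growth(n_writes=200):
    """Write repeatedly with auto-checkpointing disabled, then checkpoint by hand."""
    # Throwaway database for the demo, not Headscale's db.sqlite.
    db_path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
    conn = sqlite3.connect(db_path, isolation_level=None)  # autocommit mode
    conn.execute("PRAGMA journal_mode=WAL").fetchone()
    conn.execute("PRAGMA wal_autocheckpoint=0").fetchone()  # setting from the report
    conn.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")
    conn.execute("INSERT INTO kv VALUES (1, 'x')")

    # Each UPDATE is its own transaction and appends at least one frame to the WAL.
    for i in range(n_writes):
        conn.execute("UPDATE kv SET v = ? WHERE k = 1", (str(i),))
    size_before = wal_size(db_path)

    # TRUNCATE copies the WAL content into the main database file and then
    # truncates the WAL file to zero bytes.
    busy, _log_frames, _checkpointed = conn.execute(
        "PRAGMA wal_checkpoint(TRUNCATE)"
    ).fetchone()
    size_after = wal_size(db_path)
    conn.close()
    return size_before, size_after, busy


if __name__ == "__main__":
    before, after, busy = demo_wal_growth()
    print(f"WAL before checkpoint: {before} bytes, after: {after} bytes, busy={busy}")
```

The main database stays small because the same row is overwritten each time, while the WAL keeps growing until the checkpoint runs.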

Expected Behavior

There should be some periodic checkpointing of the sqlite database, either by time or by number of transactions.
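Both variants of the request can be sketched in Python with the standard `sqlite3` module. This is only an illustration of the idea, not Headscale's implementation (Headscale is written in Go); the function names `set_autocheckpoint` and `start_periodic_checkpoint`, the 1000-page threshold, and the interval are assumptions chosen for the example. The first function relies on SQLite's built-in `wal_autocheckpoint` pragma (checkpoint by WAL size), and the second runs a `wal_checkpoint(TRUNCATE)` on a timer from a background thread.

```python
import sqlite3
import threading


def set_autocheckpoint(conn, pages=1000):
    """Checkpoint by WAL size: ask SQLite to checkpoint automatically once
    the WAL reaches `pages` pages. Returns the value now in effect."""
    conn.execute(f"PRAGMA wal_autocheckpoint={int(pages)}").fetchone()
    return conn.execute("PRAGMA wal_autocheckpoint").fetchone()[0]


def start_periodic_checkpoint(db_path, interval_s, stop_event):
    """Checkpoint by time: run a TRUNCATE checkpoint every interval_s seconds
    until stop_event is set. Returns the started thread."""

    def worker():
        # The worker opens its own connection, since sqlite3 connections
        # should not be shared across threads by default.
        conn = sqlite3.connect(db_path)
        try:
            # Event.wait returns False on timeout, True once the event is set.
            while not stop_event.wait(interval_s):
                conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
        finally:
            conn.close()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

A caller would create a `threading.Event()`, pass it to `start_periodic_checkpoint`, and call `.set()` on it at shutdown. Size-based auto-checkpointing is the simpler option because SQLite handles it inside the writing connection; a timer is mostly useful when the WAL file should also be truncated on disk.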

Steps To Reproduce

  1. Set up a tailnet with sqlite
  2. Add a bunch of nodes, keep them connected and let them ping each other periodically
  3. Wait for months
  4. See that db.sqlite-wal is huge

Environment

- OS: `Ubuntu 20.04.6 LTS`
- Headscale version: `v0.23.0`
- Tailscale version: `1.72.0`

aalmenar commented 3 weeks ago

In just two weeks of personal usage, the WAL file has grown to 63MB from a fresh install.

nblock commented 3 weeks ago

A workaround to recover disk space and write the contents of the WAL back to the main database:


$ sudo systemctl stop headscale
$ sudo -u headscale -i
$ sqlite3 /path/to/headscale/db.sqlite VACUUM
$ sudo systemctl start headscale