spantaleev / matrix-docker-ansible-deploy

🐳 Matrix (An open network for secure, decentralized communication) server setup using Ansible and Docker
GNU Affero General Public License v3.0

borgbackup No space left on device #2069

Open felixx9 opened 2 years ago

felixx9 commented 2 years ago

When running borgbackup (installed via the playbook) I get this log:

Aug 27 21:03:45 HOSTNAME systemd[1]: Starting Matrix Borg Backup...
Aug 27 21:03:50 HOSTNAME matrix-backup-borg[1241]: Remote: Warning: Permanently added 'borg.DOMAIN' (ED25519) to the list of known hosts.
Aug 27 21:05:10 HOSTNAME matrix-backup-borg[4553]: Remote: Warning: Permanently added 'borg.DOMAIN' (ED25519) to the list of known hosts.
Aug 27 21:06:24 HOSTNAME matrix-backup-borg[4553]: ------------------------------------------------------------------------------
Aug 27 21:06:24 HOSTNAME matrix-backup-borg[4553]:                        Original size      Compressed size    Deduplicated size
Aug 27 21:06:24 HOSTNAME matrix-backup-borg[4553]: Deleted data:                    0 B                  0 B                  0 B
Aug 27 21:06:24 HOSTNAME matrix-backup-borg[4553]: All archives:              189.43 GB            184.96 GB             61.44 GB
Aug 27 21:06:24 HOSTNAME matrix-backup-borg[4553]:                        Unique chunks         Total chunks
Aug 27 21:06:24 HOSTNAME matrix-backup-borg[4553]: Chunk index:                  228362               754278
Aug 27 21:06:24 HOSTNAME matrix-backup-borg[4553]: ------------------------------------------------------------------------------
Aug 27 21:36:40 HOSTNAME matrix-backup-borg[4553]: /tmp/.borgmatic/postgresql_databases/matrix-postgres/synapse: read: [Errno 28] No space left on device: '/tmp/.cache/borg/aa203a7717d3abe4fe60d424157f749fafb5b7c84a29fbdb13d182bebc2d7a99/files' -> '/tmp/.cache/borg/aa203a7717d3abe4fe60d424157f749fafb5b7c84a29fbdb13d182bebc2d7a99/txn.tmp/files'
Aug 27 21:36:40 HOSTNAME matrix-backup-borg[4553]: borg@borg.DOMAIN:/home/borg/my_servers/REPO: Error running actions for repository
Aug 27 21:36:40 HOSTNAME matrix-backup-borg[4553]: Command 'pg_dump --no-password --clean --if-exists --host matrix-postgres --port 5432 --username matrix --format custom synapse > /tmp/.borgmatic/postgresql_databases/matrix-postgres/synapse' died with <Signals.SIGPIPE: 13>.
Aug 27 21:36:40 HOSTNAME matrix-backup-borg[4553]: Error while creating a backup.
Aug 27 21:36:40 HOSTNAME matrix-backup-borg[4553]: /etc/borgmatic.d/config.yaml: Error running configuration file
Aug 27 21:36:40 HOSTNAME matrix-backup-borg[4553]: summary:
Aug 27 21:36:40 HOSTNAME matrix-backup-borg[4553]: /etc/borgmatic.d/config.yaml: Error running configuration file
Aug 27 21:36:40 HOSTNAME matrix-backup-borg[4553]: borg@borg.DOMAIN:/home/borg/my_servers/HS_DOMAIN.matrix: Error running actions for repository
Aug 27 21:36:40 HOSTNAME matrix-backup-borg[4553]: Command 'pg_dump --no-password --clean --if-exists --host matrix-postgres --port 5432 --username matrix --format custom synapse > /tmp/.borgmatic/postgresql_databases/matrix-postgres/synapse' died with <Signals.SIGPIPE: 13>.
Aug 27 21:36:40 HOSTNAME matrix-backup-borg[4553]: Need some help? https://torsion.org/borgmatic/#issues
Aug 27 21:36:40 HOSTNAME systemd[1]: matrix-backup-borg.service: Main process exited, code=exited, status=1/FAILURE
Aug 27 21:36:40 HOSTNAME systemd[1]: matrix-backup-borg.service: Failed with result 'exit-code'.
Aug 27 21:36:40 HOSTNAME systemd[1]: Failed to start Matrix Borg Backup.

The backup had been running for about two weeks without issues. There is plenty of space on the SSD:

USER@HOSTNAME:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            5.9G     0  5.9G   0% /dev
tmpfs           1.2G  1.8M  1.2G   1% /run
/dev/vda3       314G  117G  185G  39% /
tmpfs           5.9G     0  5.9G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/vda2       974M  181M  726M  20% /boot
tmpfs           1.2G     0  1.2G   0% /run/user/0

What could cause this behavior?

felixx9 commented 2 years ago

(Was) working since #2070 was solved. Misconfiguration on my side and (at least to me) a misleading error message. Solved.

felixx9 commented 2 years ago

It worked 2 or 3 times, and since then it fails again with "no space left on device". Could that be a problem resulting from the playbook configuration? Or do I need to dig into the container "shipper"?

smargold476 commented 2 years ago

Hey,

It looks like I found some hints regarding your issue (I had the same one before I ran into a new issue ...):

It looks like we need to replace the folder /tmp/.cache/borg/<ID>/chunks.archive.d/ with a file, as these guys did:

        # Fix too large Borg cache
        # https://borgbackup.readthedocs.io/en/stable/faq.html#the-borg-cache-eats-way-too-much-disk-space-what-can-i-do
        BORG_ID="$(borg config "$BORG_BACKUP_DIRECTORY" id)"
        rm -r "/root/.cache/borg/$BORG_ID/chunks.archive.d"
        touch "/root/.cache/borg/$BORG_ID/chunks.archive.d"

Infos from https://borgbackup.readthedocs.io
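
Adapted to the cache path from the log in the first post (/tmp/.cache/borg/<ID> rather than /root/.cache/borg), the same workaround might look roughly like the sketch below. The function name `disable_chunks_archive` is made up for illustration, and the demo runs on a throwaway directory, not on a real cache. Note that the error in the log names the `files` file, not chunks.archive.d, so this alone may not be enough.

```shell
#!/bin/sh
# Hypothetical helper: replace chunks.archive.d with an empty file so that
# Borg cannot store anything in it again (the approach from the Borg FAQ).
# $1 = the per-repository cache directory, e.g. /tmp/.cache/borg/<ID>
disable_chunks_archive() {
    repo_cache=$1
    [ -d "$repo_cache" ] || { echo "no such cache dir: $repo_cache" >&2; return 1; }
    rm -rf "$repo_cache/chunks.archive.d"
    touch "$repo_cache/chunks.archive.d"
}

# Demonstration on a temporary directory standing in for the real cache.
demo=$(mktemp -d)
mkdir -p "$demo/chunks.archive.d"
disable_chunks_archive "$demo" && ls -l "$demo"
rm -rf "$demo"
```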

I guess fixing this in the source container image is not possible, is it? https://gitlab.com/etke.cc/borgmatic/-/blob/main/Dockerfile

Regards

felixx9 commented 2 years ago

Ah, interesting. But it looks more like a workaround (?)

In a different context (I don't remember where), somebody suggested writing the cache to a mapped volume instead of to the container's RAM, which then fills up quickly. Sounds reasonable to me.
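
A rough sketch of what that could look like, assuming the container is started with `docker run` and that Borg honours its `BORG_CACHE_DIR` environment variable. The helper name `borg_cache_args`, the host directory, and the in-container path `/borg-cache` are all assumptions for illustration; nothing here is taken from the playbook.

```shell
#!/bin/sh
# Hypothetical helper: print docker-run arguments that keep the Borg cache
# on a host directory instead of inside the container's /tmp.
# $1 = host directory (assumption), $2 = path inside the container (assumption)
borg_cache_args() {
    host_dir=$1
    container_dir=${2:-/borg-cache}
    printf '%s\n' \
        "--mount type=bind,src=${host_dir},dst=${container_dir}" \
        "--env BORG_CACHE_DIR=${container_dir}"
}

# Example with a made-up host path.
borg_cache_args /matrix/backup-borg/cache
```

The printed arguments would have to be added to whatever `docker run` line starts the backup container, and the host directory would have to exist and be writable by the user the container runs as.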

smargold476 commented 2 years ago

Absolutely. Even the docs call it a long-term workaround hack, but you are right -->

this is only recommended if you have a fast, low latency connection to your repo (e.g. if repo is local disk)

Writing the cache to a mapped volume would be much nicer.

UPDATE: isn't it a volume already, just that 100 MB is not enough?

--tmpfs=/tmp:rw,noexec,nosuid,size=100m \
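
If that 100 MB tmpfs is what fills up, it would also explain why `df -h` on the host looks fine: the tmpfs only exists inside the container. The log shows Borg copying `files` to `txn.tmp/files`, so for a moment two copies of that file have to fit. A small sketch of that arithmetic, with made-up numbers and a hypothetical helper name `copy_fits`:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a file of $1 bytes can be copied into a
# filesystem that still has $2 bytes available.
copy_fits() {
    file_bytes=$1
    avail_bytes=$2
    [ "$file_bytes" -le "$avail_bytes" ]
}

# Made-up example: a 60 MiB files cache inside a 100 MiB tmpfs leaves only
# 40 MiB free, so a second copy (txn.tmp/files) cannot be written.
tmpfs_bytes=$((100 * 1024 * 1024))
files_bytes=$((60 * 1024 * 1024))
if copy_fits "$files_bytes" "$((tmpfs_bytes - files_bytes))"; then
    echo "copy fits"
else
    echo "copy does not fit"
fi
```

Real numbers would have to come from inside the running container, for example from `du -s /tmp/.cache/borg` and `df /tmp`.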

regards

aine-etke commented 1 year ago

> Hey,
>
> It looks like I found some hints regarding your issue (I had the same one before I ran into a new issue ...):
>
> It looks like we need to replace the folder /tmp/.cache/borg/<ID>/chunks.archive.d/ with a file, as these guys did:
>
>         # Fix too large Borg cache
>         # https://borgbackup.readthedocs.io/en/stable/faq.html#the-borg-cache-eats-way-too-much-disk-space-what-can-i-do
>         BORG_ID="$(borg config "$BORG_BACKUP_DIRECTORY" id)"
>         rm -r "/root/.cache/borg/$BORG_ID/chunks.archive.d"
>         touch "/root/.cache/borg/$BORG_ID/chunks.archive.d"
>
> Infos from https://borgbackup.readthedocs.io
>
> I guess fixing this in the source container image is not possible, is it? https://gitlab.com/etke.cc/borgmatic/-/blob/main/Dockerfile
>
> Regards

You can always send a Merge Request.

> Absolutely. Even the docs call it a long-term workaround hack, but you are right -->
>
> this is only recommended if you have a fast, low latency connection to your repo (e.g. if repo is local disk)
>
> Writing the cache to a mapped volume would be much nicer.
>
> UPDATE: isn't it a volume already, just that 100 MB is not enough?
>
> --tmpfs=/tmp:rw,noexec,nosuid,size=100m \
>
> regards

Same here, feel free to send a PR.