nextcloud / all-in-one

📦 The official Nextcloud installation method. Provides easy deployment and maintenance with most features included in this one Nextcloud instance.
https://hub.docker.com/r/nextcloud/all-in-one
GNU Affero General Public License v3.0

Daily borg backup locked at creating archive #4994

Closed allista closed 1 month ago

allista commented 1 month ago

Discussed in https://github.com/nextcloud/all-in-one/discussions/4895

Originally posted by **Jycreyn** June 26, 2024

### Steps to reproduce

1. Wait for daily backup

### Expected behavior

The backup should complete, but after 10 hours we have to kill it manually because we need to work.

### Actual behavior

It gets stuck at "Creating archive".

### Host OS

debian 12

#### Nextcloud AIO version

Nextcloud AIO v9.0.1

#### Current channel

latest

#### Other valuable info

borg logs:

Performing backup...
Starting the backup...
Killed stale lock nextcloud-aio-borgbackup.37-0.
Removed stale exclusive roster lock for host nextcloud-aio-borgbackup pid 37 thread 0.
Removed stale exclusive roster lock for host nextcloud-aio-borgbackup pid 37 thread 0.
Killed stale lock nextcloud-aio-borgbackup.37-0.
Removed stale exclusive roster lock for host nextcloud-aio-borgbackup pid 37 thread 0.
Removed stale exclusive roster lock for host nextcloud-aio-borgbackup pid 37 thread 0.
Creating archive at "/mnt/borgbackup/borg::20240625_220112-nextcloud-aio"

Note

This issue is already affecting multiple users with different setups, and thus cannot be considered just a question.

The proposed answer that the backup location is unavailable does not hold true in several cases.

Something is not right with automatic daily backups.

See the discussion thread attached at the top for details.

allista commented 1 month ago

Have the same issue. Master container logs:

2024-07-17T04:00:08.247331829Z Daily backup script has started
2024-07-17T04:00:08.444381276Z grep: write error: Broken pipe
2024-07-17T04:00:08.755255866Z Connection to nextcloud-aio-apache (172.21.0.12) 11000 port [tcp/*] succeeded!
2024-07-17T04:00:08.852545674Z Starting mastercontainer update...
2024-07-17T04:00:08.852587218Z (The script might get exited due to that. In order to update all the other containers correctly, you need to run this script with the same settings a second time.)
2024-07-17T04:00:17.356893648Z Waiting for watchtower to stop
2024-07-17T04:00:47.391816149Z Creating daily backup...
2024-07-17T04:01:29.076473986Z Waiting for backup container to stop
...
2024-07-17T05:26:34.888845852Z Waiting for backup container to stop
2024-07-17T05:27:04.922058143Z Waiting for backup container to stop
2024-07-17T05:27:34.952227499Z Waiting for backup container to stop

At this point the mastercontainer continues to wait indefinitely.
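To put a number on how long the mastercontainer has been waiting, the timestamps of the repeated "Waiting for backup container to stop" lines can be compared. The following is only a rough sketch: the helper names `parse_ts` and `wait_duration` are made up for illustration, and the sample log is a three-line excerpt of the output above.

```python
from datetime import datetime, timedelta

MARKER = "Waiting for backup container to stop"


def parse_ts(stamp: str) -> datetime:
    """Parse a Docker log timestamp such as 2024-07-17T04:01:29.076473986Z."""
    # Docker prints nanoseconds; datetime only stores microseconds, so trim.
    head, frac = stamp.rstrip("Z").split(".")
    return datetime.strptime(f"{head}.{frac[:6]}", "%Y-%m-%dT%H:%M:%S.%f")


def wait_duration(log_text: str) -> timedelta:
    """Time between the first and the last 'waiting' line in the log."""
    stamps = [
        parse_ts(line.split(" ", 1)[0])
        for line in log_text.splitlines()
        if MARKER in line
    ]
    if not stamps:
        return timedelta(0)
    return stamps[-1] - stamps[0]


# Excerpt of the mastercontainer log posted above.
LOG = """\
2024-07-17T04:00:47.391816149Z Creating daily backup...
2024-07-17T04:01:29.076473986Z Waiting for backup container to stop
2024-07-17T05:27:34.952227499Z Waiting for backup container to stop
"""

elapsed = wait_duration(LOG)
print(f"Waiting for the backup container for {elapsed}")
```

For the excerpt above this comes to a little over 1 hour 26 minutes between the first and last "waiting" line.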

Backup container logs are as follows:

2024-07-17T04:01:29.089733935Z Performing backup...
2024-07-17T04:01:29.090128879Z Starting the backup...
2024-07-17T04:01:32.187558779Z Creating archive at "/mnt/borgbackup/borg::20240717_040129-nextcloud-aio"

Backup location is on a local hard drive, always mounted. Manual backup goes without a problem.

szaimen commented 1 month ago

Hm... I fear I still cannot reproduce this on my test server.

Can you maybe post the output of sudo docker inspect nextcloud-aio-borgbackup here?

allista commented 1 month ago

Here you are, with password masked:

nextcloud-aio-borgbackup_inspect.json

szaimen commented 1 month ago

Thanks! Can you run sudo mount | grep /home and post the output here?

szaimen commented 1 month ago

Also, it looks like you are running Docker as a snap installation. As far as I know, this is discouraged by the Docker maintainers and might result in weird behaviour like this. I would really recommend the "normal", native install of Docker. See https://docs.docker.com/engine/install/#supported-platforms or use the convenience script: curl -fsSL https://get.docker.com | sudo sh

allista commented 1 month ago

Mounts are as follows:

/dev/mapper/storage on /home/storage type ext4 (rw,noatime,nodiratime)
/dev/mapper/nextcloud-data_crypt on /home/storage/nextcloud/data type ext4 (rw,noatime,nodiratime,discard,stripe=8191)

So the data is on a separate disk, while /home/storage/nextcloud/backup is just a directory on the main storage disk array, on which a lot of other services store their data.

Re-installing docker would wreak havoc, I'm afraid. I have quite a lot of containers running. Snap version was preinstalled with the ubuntu server, as far as I can remember, so it seems to be the recommended way, as far as Canonical is concerned that is :shrug:

szaimen commented 1 month ago

> /dev/mapper/storage on /home/storage type ext4 (rw,noatime,nodiratime)
> /dev/mapper/nextcloud-data_crypt on /home/storage/nextcloud/data type ext4 (rw,noatime,nodiratime,discard,stripe=8191)

So "Source": "/home/storage/nextcloud/data", goes to /dev/mapper/nextcloud-data_crypt and "Source": "/home/storage/nextcloud/backup", goes to /dev/mapper/storage IIRC? In that case /dev/mapper/nextcloud-data_crypt might be the culprit if it has some problems...

In any case I'll move this back to discussions since I cannot reproduce it. I'll be happy to help further if there is some proof that this is actually caused by AIO and not by the storage or docker configuration.
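The path-to-device mapping discussed above (which bind-mount source lives on which device) can also be checked mechanically by matching each path against the longest mount point that contains it. This is a minimal sketch, assuming `mount` output in the format posted in this thread; the function names `parse_mounts` and `backing_device` are hypothetical and not part of AIO or Docker.

```python
import posixpath

# The two lines of `mount | grep /home` output posted in this thread.
MOUNT_OUTPUT = """\
/dev/mapper/storage on /home/storage type ext4 (rw,noatime,nodiratime)
/dev/mapper/nextcloud-data_crypt on /home/storage/nextcloud/data type ext4 (rw,noatime,nodiratime,discard,stripe=8191)
"""


def parse_mounts(text):
    """Map mount point -> device from lines like '<dev> on <dir> type <fs> (...)'."""
    mounts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        device, rest = line.split(" on ", 1)
        mount_point = rest.split(" type ", 1)[0]
        mounts[mount_point] = device
    return mounts


def backing_device(path, mounts):
    """Return the device of the longest mount point that contains `path`."""
    best = None
    for mount_point in mounts:
        # `path` is inside `mount_point` if their common path is the mount point.
        if posixpath.commonpath([path, mount_point]) == mount_point:
            if best is None or len(mount_point) > len(best):
                best = mount_point
    return mounts[best] if best is not None else None


mounts = parse_mounts(MOUNT_OUTPUT)
for source in ("/home/storage/nextcloud/data", "/home/storage/nextcloud/backup"):
    print(source, "->", backing_device(source, mounts))
```

With the mounts from this thread, the data directory resolves to /dev/mapper/nextcloud-data_crypt and the backup directory to /dev/mapper/storage, which matches the reading above.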