nextcloud / all-in-one

📦 The official Nextcloud installation method. Provides easy deployment and maintenance with most features included in this one Nextcloud instance.
https://hub.docker.com/r/nextcloud/all-in-one
GNU Affero General Public License v3.0

Nextcloud down and daily backup stuck #4812

Closed marcello-dev closed 1 month ago

marcello-dev commented 1 month ago

This is a continuation of https://github.com/nextcloud/all-in-one/issues/4736 because the issue appeared again, though slightly differently this time. Details below.

Steps to reproduce

  1. Wait for the daily backup to complete

Expected behavior

The backup should start at 4am UTC and finish after a few minutes

Actual behavior

The backup is still running at 2pm UTC (~10 hours later)

Host OS

Ubuntu 22.04

Nextcloud AIO version

v8.3.0

Current channel

latest

Other valuable info

The Borg container is still running after 10 hours, when it usually takes only a few minutes. Logs from Borg container:

Performing backup...
Starting the backup...
Creating archive at "/mnt/borgbackup/borg::20240611_040124-nextcloud-aio"
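The archive name in the log line above appears to start with a `YYYYMMDD_HHMMSS` stamp, so the running time of a backup can be estimated from it. The following is a minimal sketch, assuming the stamp is in UTC (consistent with the 4am UTC schedule mentioned in this issue); the function names and the one-hour limit are hypothetical and not part of AIO.

```python
from datetime import datetime, timedelta, timezone


def archive_start_time(archive_name: str) -> datetime:
    """Parse the leading YYYYMMDD_HHMMSS stamp of an archive name (assumed UTC)."""
    stamp = archive_name.split("-", 1)[0]
    return datetime.strptime(stamp, "%Y%m%d_%H%M%S").replace(tzinfo=timezone.utc)


def is_overdue(archive_name: str, now: datetime, limit: timedelta) -> bool:
    """Return True if the backup has been running longer than `limit`."""
    return now - archive_start_time(archive_name) > limit


if __name__ == "__main__":
    name = "20240611_040124-nextcloud-aio"
    now = datetime(2024, 6, 11, 14, 0, tzinfo=timezone.utc)  # 2pm UTC, as reported
    print(now - archive_start_time(name))  # 9:58:36
    print(is_overdue(name, now, timedelta(hours=1)))  # True
```

A check like this only says that the backup is taking unusually long; it does not say whether Borg is still making progress.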

The AIO logs show that it is waiting for the Borg container to stop (see logs below):

Daily backup script has started
grep: write error: Broken pipe
Connection to nextcloud-aio-apache (172.18.0.10) 11000 port [tcp/*] succeeded!
Starting mastercontainer update...
(The script might get exited due to that. In order to update all the other containers correctly, you need to run this script with the same settings a second time.)
Waiting for watchtower to stop
Creating daily backup...
Waiting for backup container to stop
Waiting for backup container to stop
Waiting for backup container to stop
Waiting for backup container to stop
Waiting for backup container to stop
.....
Waiting for backup container to stop
Waiting for backup container to stop
Deleting duplicate sessions
Waiting for backup container to stop
NOTICE: PHP message: 404 Not Found
Type: Slim\Exception\HttpNotFoundException

4
Message: Not found.
File: /var/www/docker-aio/php/vendor/slim/slim/Slim/Middleware/RoutingMiddleware.php
Line: 76
Trace: #0 /var/www/docker-aio/php/vendor/slim/slim/Slim/Routing/RouteRunner.php(56): Slim\Middleware\RoutingMiddleware->performRouting(Object(GuzzleHttp\Psr7\ServerRequest))
#1 /var/www/docker-aio/php/vendor/slim/csrf/src/Guard.php(482): Slim\Routing\RouteRunner->handle(Object(GuzzleHttp\Psr7\ServerRequest))
#2 /var/www/docker-aio/php/vendor/slim/slim/Slim/MiddlewareDispatcher.php(168): Slim\Csrf\Guard->process(Object(GuzzleHttp\Psr7\ServerRequest), Object(Slim\Routing\RouteRunner))
#3 /var/www/docker-aio/php/vendor/slim/twig-view/src/TwigMiddleware.php(115): Psr\Http\Server\RequestHandlerInterface@anonymous->handle(Object(GuzzleHttp\Psr7\ServerRequest))
#4 /var/www/docker-aio/php/vendor/slim/slim/Slim/MiddlewareDispatcher.php(121): Slim\Views\TwigMiddleware->process(Object(GuzzleHttp\Psr7\ServerRequest), Object(Psr\Http\Server\RequestHandlerInterface@anonymous))
#5 /var/www/docker-aio/php/src/Middleware/AuthMiddleware.php(38): Psr\Http\Server\RequestHandlerInterface@anonymous->handle(Object(GuzzleHttp\Psr7\ServerRequest))
#6 /var/www/docker-aio/php/vendor/slim/slim/Slim/MiddlewareDispatcher.php(269): AIO\Middleware\AuthMiddleware->__invoke(Object(GuzzleHttp\Psr7\ServerRequest), Object(Psr\Http\Server\RequestHandlerInterface@anonymous))
#7 /var/www/docker-aio/php/vendor/slim/slim/Slim/Middleware/ErrorMiddleware.php(76): Psr\Http\Server\RequestHandlerInterface@anonymous->handle(Object(GuzzleHttp\Psr7\ServerRequest))
#8 /var/www/docker-aio/php/vendor/slim/slim/Slim/MiddlewareDispatcher.php(121): Slim\Middleware\ErrorMiddleware->process(Object(GuzzleHttp\Psr7\ServerRequest), Object(Psr\Http\Server\RequestHandlerInterface@anonymous))
#9 /var/www/docker-aio/php/vendor/slim/slim/Slim/MiddlewareDispatcher.php(65): Psr\Http\Server\RequestHandlerInterface@anonymous->handle(Object(GuzzleHttp\Psr7\ServerRequest))
#10 /var/www/docker-aio/php/vendor/slim/slim/Slim/App.php(199): Slim\MiddlewareDispatcher->handle(Object(GuzzleHttp\Psr7\ServerRequest))
#11 /var/www/docker-aio/php/vendor/slim/slim/Slim/App.php(183): Slim\App->handle(Object(GuzzleHttp\Psr7\ServerRequest))
#12 /var/www/docker-aio/php/public/index.php(185): Slim\App->run()
#13 {main}
Tips: To display error details in HTTP response set "displayErrorDetails" to true in the ErrorHandler constructor.
Waiting for backup container to stop
Waiting for backup container to stop
Waiting for backup container to stop
...

I tried to stop the Borg container manually and got:

Error response from daemon: cannot stop container: eafea67cfdc0: tried to kill container, but did not receive an exit event

I still don't have the feature to manually unblock the backup from the dashboard, because it was implemented for AIO v9. Also, how can I tell whether Borg is still doing something? I uploaded some big files yesterday, but it should not take that long.
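One rough way to approach the "is it still doing something?" question on a Linux host is to sample the I/O counters in `/proc/<pid>/io` twice and compare them. The sketch below assumes you already know the PID of the process doing the work and have permission to read its `/proc` entry (usually root); the function names are hypothetical and this is not an AIO feature.

```python
def parse_proc_io(text: str) -> dict[str, int]:
    """Parse the 'name: value' lines of /proc/<pid>/io into a dict of counters."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def made_progress(before: dict[str, int], after: dict[str, int],
                  keys=("rchar", "wchar", "read_bytes", "write_bytes")) -> bool:
    """Return True if any of the selected counters grew between two samples."""
    return any(after.get(k, 0) > before.get(k, 0) for k in keys)


def read_proc_io(pid: int) -> dict[str, int]:
    """Read the live counters for a PID (Linux only, needs read permission)."""
    with open(f"/proc/{pid}/io") as f:
        return parse_proc_io(f.read())
```

Taking two samples with `read_proc_io` about a minute apart and passing them to `made_progress` gives a hint: if nothing changes over several samples, the process is probably blocked rather than busy. The main PID of a container can be looked up with `docker inspect --format '{{.State.Pid}}' <container>`, but that PID may belong to a wrapper script rather than to Borg itself, so the counters of its child processes may need to be checked as well.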

New feature idea: it would be good to add a timestamp next to the log message for better troubleshooting.
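To illustrate the idea, here is a minimal sketch of prefixing each log line with a UTC timestamp. It is not how AIO implements this; the function names and the ISO-8601-style format are assumptions chosen for the example.

```python
from datetime import datetime, timezone


def stamp(line: str, now: datetime) -> str:
    """Prefix a log line with a UTC timestamp such as 2024-06-11T04:01:24Z."""
    utc = now.astimezone(timezone.utc)
    return f"{utc.strftime('%Y-%m-%dT%H:%M:%SZ')} {line}"


def stamp_stream(lines, clock=lambda: datetime.now(timezone.utc)):
    """Yield each incoming line with the time at which it was seen."""
    for line in lines:
        yield stamp(line.rstrip("\n"), clock())


if __name__ == "__main__":
    fixed = datetime(2024, 6, 11, 4, 1, 24, tzinfo=timezone.utc)
    demo = ["Creating daily backup...", "Waiting for backup container to stop"]
    for out in stamp_stream(demo, clock=lambda: fixed):
        print(out)
```

With timestamps, a long run of "Waiting for backup container to stop" lines would show directly how long the wait has lasted and when the error in the middle of the log occurred.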

szaimen commented 1 month ago

New feature idea: it would be good to add a timestamp next to the log message for better troubleshooting.

Good idea.

Keeping this here for later: Screenshot_20240611_232344_Brave

marcello-dev commented 1 month ago

Nextcloud is finally up again. I could not kill the Borg container, so I force-restarted the server. On startup, Ubuntu failed to boot because the external drive (where the Nextcloud datadir is located) could not be mounted properly. Unplugging the external drive and plugging it back in fixed the drive issue, and Ubuntu could then start. The Borg container also started on boot, but this time it finished the backup correctly and Nextcloud came back online. I'd say, the backup getting stuck is due to my external drive playing games with me.

szaimen commented 1 month ago

I'd say, the backup getting stuck is due to my external drive playing games with me.

Yes, I think so as well.
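Since the root cause here was an external drive that did not mount properly, a pre-flight check that the drive is actually mounted can catch this class of problem early. The sketch below parses the text of `/proc/mounts` on Linux; the path `/mnt/external` is a hypothetical mount point, and the function names are made up for illustration.

```python
def mounted_paths(proc_mounts_text: str) -> set[str]:
    """Collect the mount points listed in /proc/mounts-style text."""
    paths = set()
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # The kernel escapes spaces in mount points as \040.
            paths.add(fields[1].replace("\\040", " "))
    return paths


def is_mounted(path: str, proc_mounts_text: str) -> bool:
    """Return True if `path` is itself a mount point in the given text."""
    normalized = path.rstrip("/") or "/"
    return normalized in mounted_paths(proc_mounts_text)


if __name__ == "__main__":
    with open("/proc/mounts") as f:  # Linux only
        text = f.read()
    # "/mnt/external" is a hypothetical mount point for the external drive.
    print(is_mounted("/mnt/external", text))
```

For a quick live check, the standard library's `os.path.ismount(path)` does much the same job. Neither check detects a drive that is mounted but has stopped responding, which may be what happened during the stuck backup.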

szaimen commented 4 weeks ago

New feature idea: it would be good to add a timestamp next to the log message for better troubleshooting.

Good idea.

This is now released with v9.1.0 Beta. Testing and feedback are welcome! See https://github.com/nextcloud/all-in-one#how-to-switch-the-channel