markdumay / synology-docker

An Unofficial Script to Update or Restore Docker Engine and Docker Compose on Synology
MIT License

[BUG] After update, Docker service doesn't start. #48

Open CardcaptorRLH85 opened 3 years ago

CardcaptorRLH85 commented 3 years ago

Describe the bug

Exactly as the title says, the Docker service doesn't start after being updated with this script. Fortunately, everything still works if I restore the binaries from the backup. However, the entire reason I used this script was to get a more up-to-date version of Docker, because of the known issue of not being able to update environment variables with the version currently supplied by Synology.

Here is the specific error message:

Step 9 from 10: Starting Docker service
ERROR: Could not bring Docker Engine back online

To reproduce

I ran the following command: sudo ./syno_docker_update.sh update. It failed at "Step 9 from 10: Starting Docker service".

Expected behavior

I expected Docker to simply start running again after the update.

Log file

CardcaptorRLH85@Rin-san:~/synology-docker$ sudo ./syno_docker_update.sh update
Update Docker Engine and Docker Compose on Synology to target version

Current DSM version: 6.2.3
Current Docker version: 18.09.8
Current Docker Compose version: 1.24.0
Target Docker version: 20.10.2
Target Docker Compose version: 1.27.4

WARNING! This will replace:
  - Docker Engine
  - Docker Compose
  - Docker daemon log driver

Are you sure you want to continue? [y/N] y
Step 1 from 10: Downloading target Docker binary (https://download.docker.com/linux/static/stable/x86_64/docker-20.10.2.tgz)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 65.7M  100 65.7M    0     0  10.7M      0  0:00:06  0:00:06 --:--:-- 11.1M
Step 2 from 10: Downloading target Docker Compose binary (https://github.com/docker/compose/releases/download/1.27.4/docker-compose-Linux-x86_64)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   651  100   651    0     0   4247      0 --:--:-- --:--:-- --:--:--  4254
100 11.6M  100 11.6M    0     0  7434k      0  0:00:01  0:00:01 --:--:-- 8938k
Step 3 from 10: Stopping Docker service
pkgctl-Docker stoped.
Step 4 from 10: Backing up current Docker binaries (/volume1/homes/CardcaptorRLH85/synology-docker/docker_backup_20210116_160503.tgz)
bin/
bin/docker-init
bin/docker
bin/auplink
bin/ctr
bin/docker-compose
bin/runc
bin/containerd-shim
bin/containerd-shim-runc-v2
bin/containerd
bin/docker-proxy
bin/dockerd
dockerd.json
start-stop-status
Step 5 from 10: Extracting target Docker binary (/tmp/docker_update/docker-20.10.2.tgz)
docker/
docker/containerd
docker/docker-init
docker/ctr
docker/containerd-shim
docker/containerd-shim-runc-v2
docker/runc
docker/docker-proxy
docker/dockerd
docker/docker
Step 6 from 10: Installing binaries
Step 7 from 10: Configuring log driver
Step 8 from 10: Enabling IP forwarding
Step 9 from 10: Starting Docker service
ERROR: Could not bring Docker Engine back online
CardcaptorRLH85@Rin-san:~/synology-docker$

Docker daemon configuration

{
    "data-root" : "/var/packages/Docker/target/docker",
    "log-driver" : "json-file",
    "registry-mirrors" : [],
    "group": "administrators"
}

Additional context

As I was going back through the steps to collect the data for this bug report, I decided to try installing Docker version 19.03.14 (currently the latest version of the 19.x branch) instead of the latest version (as of now, 20.10.2). Unfortunately, that failed in the same manner. Finally, I tried an upgrade to the most recent version of the 18.x branch, 18.09.9, which is only one revision (and just under two months) newer than the eighteen-month-old 18.09.8 version that Synology currently ships. That also failed in the same manner.

markdumay commented 3 years ago

Hi @CardcaptorRLH85, sorry to hear about your issues. Luckily you were able to restore the Docker engine to its original state. The log file and configuration all look normal to me, right until step 9 that is. A few thoughts and questions, if you're up for additional investigation.

  1. What filesystem does your NAS use?
  2. What does the original (backed up) version of your Docker daemon configuration look like?
  3. Does it help to manually shut down all Docker containers prior to the update?
  4. Does it help to manually start the Docker service when the script fails at step 9 (e.g. sudo synoservicectl --start pkgctl-Docker)?

CardcaptorRLH85 commented 3 years ago

I'd be glad to help out.

  1. I'm using Btrfs.
  2. Here's my dockerd.json backup.
    {
    "data-root" : "/var/packages/Docker/target/docker",
    "log-driver" : "db",
    "registry-mirrors" : [],
    "storage-driver" : "btrfs"
    }
  3. This worked!
  4. This is the first thing I tried and, unfortunately, it didn't help at all. I got an error code of 0 when I tried running that command.

When I tried shutting down all of my containers first just now, it worked. I probably should have tried that but it just never crossed my mind.
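
The manual step that worked here (stopping every container before the update) can be sketched as a small shell function. This is only a sketch: the function name `stop_running_containers` is made up for illustration, and it assumes the `docker` CLI is on the PATH and the shell has permission to talk to the daemon (for example, a root shell).

```shell
# Hypothetical helper, not part of syno_docker_update.sh.
# Stops every running container so the update starts from a quiet daemon.
stop_running_containers() {
    # `docker ps -q` prints only the IDs of running containers, one per line.
    ids=$(docker ps -q) || return 1
    if [ -z "$ids" ]; then
        echo "No running containers."
        return 0
    fi
    for id in $ids; do
        echo "Stopping $id"
        docker stop "$id" >/dev/null || return 1
    done
}
```

Calling `stop_running_containers` first and then running `./syno_docker_update.sh update` from the same root shell would reproduce the sequence described above.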

EDIT: It seems like there's a different issue now, unfortunately. When I try to start a container I see this message:

CardcaptorRLH85@Rin-san:~/synology-docker$ sudo docker start Portainer-CE
Password:
Error response from daemon: failed to initialize logging driver: failed to get logging factory: logger: no log driver named 'db' is registered: error looking up logging plugin db: plugin "db" not found
Error: failed to start containers: Portainer-CE
CardcaptorRLH85@Rin-san:~/synology-docker$ docker rename Portainer-CE portainer
CardcaptorRLH85@Rin-san:~/synology-docker$  sudo docker run -d -p 8000:8000 -p 9000:9000 --name=Portainer-CE --restart=always -v /var/run/docker.sock:/var/run/docker.sock -v /volume1/docker/Portainer-CE:/data portainer/portainer-ce
docker: Error response from daemon: Failed to create btrfs snapshot: inappropriate ioctl for device.
See 'docker run --help'.

Without looking into it I'm guessing that the db logging driver is a Synology-specific thing (after looking for just a couple of minutes, I didn't see it mentioned anywhere in Docker's documentation), so I'd understand if it doesn't work after an update. As my terminal output shows, I then renamed the old container and recreated it from the command line. Unfortunately, that produced this error: docker: Error response from daemon: Failed to create btrfs snapshot: inappropriate ioctl for device.

Secondly, I've noticed that after this update all of the environment variables on my containers have been removed. I don't know what happened, they're just gone.

markdumay commented 3 years ago

Thanks for the follow up @CardcaptorRLH85.

I'm using Btrfs.

The btrfs support seems to be questionable (see #22). My NAS uses ext4, so I haven't found a way to test this myself. Does the configuration below work on your NAS? This is a fix that I'll address in #40.

{
   "data-root" : "/var/packages/Docker/target/docker",
   "log-driver" : "json-file",
   "registry-mirrors" : [],
   "storage-driver" : "btrfs"
}

Without looking into it I'm guessing that the db logging driver is a Synology-specific thing

The db logging driver is indeed a custom driver written by Synology. They use it as input for their GUI. I haven't found the source myself, which is why this repository replaces db with the default json-file driver. As an unwanted side effect, Synology unfortunately displays the wrong uptime in the GUI.
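
A minimal sketch of that replacement, for anyone who wants to apply it by hand. The function name `use_json_file_log_driver` is hypothetical, and this is not necessarily how the script does it. The path to dockerd.json is passed in as an argument because the thread does not state where the file lives.

```shell
# Hypothetical helper: rewrite "log-driver" : "db" to "json-file" in a
# daemon configuration file, keeping a copy of the original as FILE.bak.
use_json_file_log_driver() {
    config="$1"
    [ -f "$config" ] || { echo "No such file: $config" >&2; return 1; }
    cp "$config" "$config.bak" || return 1
    # Only the value of the "log-driver" key is touched; other keys stay as-is.
    sed 's/\("log-driver"[[:space:]]*:[[:space:]]*\)"db"/\1"json-file"/' \
        "$config.bak" > "$config"
}
```

The daemon has to be restarted before a changed configuration takes effect.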

Secondly, I've noticed that after this update all of the environment variables on my containers have been removed.

I have heard about similar issues from other users. I haven't found a way to recreate the issue yet. For now, the only thing I can recommend is to use Docker Compose instead. It allows you to script the entire container initiation, including environment variables. Hopefully it won't cost you too much effort to recreate your containers. I'll add this as a known issue too.
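
Before recreating containers, it may help to save their current environment variables. The sketch below does that with `docker ps` and `docker inspect`. The function name `dump_container_env` and the output layout (one `<name>.env` file per container) are assumptions made for illustration, and it needs to run while the containers still have their variables, for example on the restored Synology version of Docker.

```shell
# Hypothetical helper: write each container's environment to OUTDIR/<name>.env.
dump_container_env() {
    outdir="$1"
    mkdir -p "$outdir" || return 1
    # List all containers by name, running or stopped.
    for name in $(docker ps -a --format '{{.Names}}'); do
        # .Config.Env holds the VAR=value pairs, including image defaults.
        docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$name" \
            > "$outdir/$name.env" || return 1
    done
}
```

The resulting files also contain variables defined by the image itself (such as PATH), so they are a starting point for a Compose file rather than something to copy over unchanged.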

CardcaptorRLH85 commented 3 years ago

I'd just finished reading the Known Issues and was about to edit my comment again before you replied. Unfortunately, that change to my config doesn't fix things. Since this makes the third time that I'll be editing the variables for ~2 dozen containers in the last few days, I'll certainly take your Docker Compose advice after restoring to the Synology version of Docker again.

markdumay commented 3 years ago

The storage driver seems to be a tough one to fix. Docker's documentation provides some additional clues. If you have the time and patience, you could try alternative storage drivers instead. See the table below for an overview.

| Storage driver    | Supported backing filesystems |
|-------------------|-------------------------------|
| overlay2, overlay | xfs with ftype=1, ext4        |
| fuse-overlayfs    | any filesystem                |
| aufs              | xfs, ext4                     |
| devicemapper      | direct-lvm                    |
| btrfs             | btrfs                         |
| zfs               | zfs                           |
| vfs               | any filesystem                |

I haven't tested this myself, but I'd be curious to know whether any of these storage drivers helps.

A warning from Docker: Important: When you change the storage driver, any existing images and containers become inaccessible. This is because their layers cannot be used by the new storage driver. If you revert your changes, you can access the old images and containers again, but any that you pulled or created using the new driver are then inaccessible.
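
The table above can be encoded as a small lookup for checking a driver against a filesystem before editing the configuration. This is a sketch only: the function names are made up, the data is just the table as quoted here, and a match says nothing about whether a given Synology kernel actually ships the driver.

```shell
# Hypothetical helpers encoding the table above.
# supported_filesystems DRIVER: print the backing filesystems for DRIVER.
supported_filesystems() {
    case "$1" in
        overlay2|overlay)   echo "xfs ext4" ;;    # xfs requires ftype=1
        fuse-overlayfs|vfs) echo "any" ;;
        aufs)               echo "xfs ext4" ;;
        devicemapper)       echo "direct-lvm" ;;
        btrfs)              echo "btrfs" ;;
        zfs)                echo "zfs" ;;
        *)                  return 1 ;;           # unknown driver
    esac
}

# driver_supports_fs DRIVER FS: succeed if the table lists FS for DRIVER.
driver_supports_fs() {
    fs_list=$(supported_filesystems "$1") || return 1
    for fs in $fs_list; do
        if [ "$fs" = "any" ] || [ "$fs" = "$2" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, `driver_supports_fs overlay2 btrfs` fails while `driver_supports_fs btrfs btrfs` succeeds. The driver currently in use can be read with `docker info --format '{{.Driver}}'`.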

draeron commented 3 years ago

I have a question. I'm trying to change my storage driver to overlay2; my backend is already an ext4 volume. The Docker daemon fails to start, but I can't figure out where the daemon logs are located. /var/log/Docker is empty.

Context: I'm trying to run the k3s agent on my Synology so that it can act as a storage node, but it requires an overlay storage driver.

markdumay commented 3 years ago

Does /var/log/upstart/pkg-Docker-dockerd.log provide any clues?

draeron commented 3 years ago

Oh, thanks! I just checked, and now I have something to investigate:

2021-04-07T18:33:08-0400 ERRO[2021-04-07T18:33:08.496803188-04:00] failed to mount overlay: no such device       storage-driver=overlay2

Same thing with overlay. After checking /proc/filesystems, I realized the DS918+ doesn't seem to support overlayfs. I checked /lib/modules and ran modprobe overlay, but there's nothing there.
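
A minimal sketch of that /proc/filesystems check, for anyone who wants to repeat it on another model. The function name `kernel_has_filesystem` is hypothetical, and the optional second argument exists only so the check can be pointed at a different file.

```shell
# Hypothetical helper: succeed if the kernel currently lists filesystem NAME.
# Usage: kernel_has_filesystem NAME [FILE]   (FILE defaults to /proc/filesystems)
kernel_has_filesystem() {
    # Each line is "[nodev]<TAB>name", so the name is always the last field.
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}
```

A filesystem built as a module only shows up here once the module is loaded, which is why the modprobe attempt matters as well.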

Kernel version is 4.4.59+ (DSM 6.2.3) which should include overlay from my understanding but they might have removed it from their distro.

I could try to install fuse-overlayfs.

markdumay commented 3 years ago

It's a pity that Synology does not support the overlay driver. They tend to heavily modify their distribution, so it doesn't come as a surprise.

I came across this article on Medium from Kristofer Lundgren, who used Docker in Docker instead of upgrading the Docker daemon. Might be worth investigating?

tevans62 commented 1 year ago

I also have this issue (doesn't start), but this is the error I see in: /var/log/upstart/pkg-Docker-dockerd.log

2023-02-25T08:28:34-0500 ERRO[2023-02-25T08:28:34.505536938-05:00] [graphdriver] prior storage driver aufs is deprecated and will be removed in a future release; update the the daemon configuration and explicitly choose this storage driver to continue using it; visit https://docs.docker.com/go/storage-driver/ for more information
2023-02-25T08:28:33-0500 INFO[2023-02-25T08:28:33.703362207-05:00] [core] [Channel #4] Channel Connectivity change to SHUTDOWN module=grpc
2023-02-25T08:28:33-0500 INFO[2023-02-25T08:28:33.704084712-05:00] [core] [Channel #4 SubChannel #5] Subchannel Connectivity change to SHUTDOWN module=grpc
2023-02-25T08:28:33-0500 INFO[2023-02-25T08:28:33.704700088-05:00] [core] [Channel #4 SubChannel #5] Subchannel deleted module=grpc
2023-02-25T08:28:33-0500 INFO[2023-02-25T08:28:33.705372381-05:00] [core] [Channel #4] Channel deleted module=grpc
2023-02-25T08:28:33-0500 INFO[2023-02-25T08:28:33.706815321-05:00] stopping event stream following graceful shutdown error="context
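
Since most of that log is INFO noise, a small filter makes the relevant entry easier to spot. This is a sketch: the function name `dockerd_errors` is made up, the default path is the one mentioned in this thread, and it assumes the daemon keeps using the four-letter level tags (ERRO, FATA) seen above.

```shell
# Hypothetical helper: print only error and fatal entries from the daemon log.
# Usage: dockerd_errors [FILE]
dockerd_errors() {
    # Match the level tag immediately before the bracketed timestamp.
    grep -E '(ERRO|FATA)\[' "${1:-/var/log/upstart/pkg-Docker-dockerd.log}"
}
```

Run against the log above, this would leave only the graphdriver line about aufs.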

This is on an old DS712+ running DSM 6.2.4-25556 Update 6. So it looks like, if the current storage driver is 'aufs', newer Docker versions need it to be set explicitly in the daemon configuration, or changed to 'overlay2'?

I may try this, but I'm a little nervous given others' attempts at changing storage drivers.

bokkoman commented 3 months ago

I also have this issue on a DS1815+ with DSM 7.1.1.

bokkoman@FLOPPYDISK:~$ sudo systemctl start pkgctl-Docker
Job for pkgctl-Docker.service failed. See "systemctl status pkgctl-Docker.service" and "journalctl -xe" for details.
bokkoman@FLOPPYDISK:~$ systemctl status pkgctl-Docker.service
● pkgctl-Docker.service - Docker's service unit
   Loaded: loaded (/usr/local/lib/systemd/system/pkgctl-Docker.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2024-06-11 13:10:54 CEST; 28s ago
  Process: 18451 ExecStart=/bin/bash -c /usr/syno/sbin/synopkgctl start $SELF && /bin/touch /var/packages/$SELF/enabled (code=exited, status=1/FAILURE)
 Main PID: 18451 (code=exited, status=1/FAILURE)