mbentley / docker-timemachine

Docker image to run Samba (compatible Time Machine for macOS)
Apache License 2.0

[Bug]: dbus-daemon using 100% CPU #135

Closed marvinruder closed 1 year ago

marvinruder commented 1 year ago

Describe the Bug

After starting the container, I am able to connect to the SMB drive, which behaves normally, but on the server, the command dbus-daemon --system --nofork --no-syslog uses 100% CPU. Also, the log is constantly filled with dbus socket not yet available; sleeping... messages.

Expected Behavior

Normal CPU usage without constant “socket not yet available” logs

Steps to Reproduce

docker compose up timemachine

How You're Launching the Container

version: "3.8"

services:
  timemachine:
    image: mbentley/timemachine:smb
    restart: unless-stopped
    network_mode: host
    environment:
      ADVERTISED_HOSTNAME: "storage1.internal.mruder.dev"
      TM_USERNAME: "core"
      PASSWORD: "********"
      SHARE_NAME: "storage1"
    volumes:
      - /var/mnt/storage:/opt/core
      - /var/cache/timemachine:/var/cache/samba
      - ./timemachine/lib:/var/lib/samba
      - ./timemachine/run:/run/samba

Container Logs

dcservices-timemachine-1  | INFO: CUSTOM_SMB_CONF=false; generating [global] section of /etc/samba/smb.conf...
dcservices-timemachine-1  | INFO: Avahi - generating base configuration in /etc/avahi/services/smbd.service...
dcservices-timemachine-1  | INFO: Avahi - using storage1.internal.mruder.dev as hostname.
dcservices-timemachine-1  | INFO: Avahi - adding the 'dk0', 'storage1' share txt-record to /etc/avahi/services/smbd.service...
dcservices-timemachine-1  | INFO: Group timemachine exists; skipping creation
dcservices-timemachine-1  | INFO: User core exists; skipping creation
dcservices-timemachine-1  | INFO: CUSTOM_SMB_CONF=false; generating [storage1] section of /etc/samba/smb.conf...
dcservices-timemachine-1  | INFO: Samba - Created User core password set to none.
dcservices-timemachine-1  | INFO: Samba - Enabled user core.
dcservices-timemachine-1  | INFO: Samba - setting password
dcservices-timemachine-1  | INFO: SET_PERMISSIONS=false; not setting ownership and permissions for /opt/core
dcservices-timemachine-1  | INFO: Avahi - completing the configuration in /etc/avahi/services/smbd.service...
dcservices-timemachine-1  | INFO: running test for xattr support on your time machine persistent storage location...
dcservices-timemachine-1  | INFO: xattr test successful - your persistent data store supports xattrs
dcservices-timemachine-1  | INFO: entrypoint complete; executing 's6-svscan /etc/s6'
dcservices-timemachine-1  | dbus socket not yet available; sleeping...
dcservices-timemachine-1  | nmbd version 4.18.2 started.
dcservices-timemachine-1  | Copyright Andrew Tridgell and the Samba Team 1992-2023
dcservices-timemachine-1  | smbd version 4.18.2 started.
dcservices-timemachine-1  | Copyright Andrew Tridgell and the Samba Team 1992-2023
dcservices-timemachine-1  | INFO: Profiling support unavailable in this build.
dcservices-timemachine-1  | dbus socket not yet available; sleeping...
dcservices-timemachine-1  | dbus socket not yet available; sleeping...
dcservices-timemachine-1  | dbus socket not yet available; sleeping...
dcservices-timemachine-1  | dbus socket not yet available; sleeping...
dcservices-timemachine-1  | dbus socket not yet available; sleeping...
.....

Additional Context

No response

mbentley commented 1 year ago

Are you running the very latest image (have you pulled the image lately)? There was https://github.com/mbentley/docker-timemachine/issues/130, which was caused by the upgrade of the Alpine packages in the new version. The main thing I can think of is that avahi isn't able to start (and dbus isn't starting or isn't creating the socket), but that issue was also spamming Failed to start message bus: Failed to bind socket messages, so there is a chance it isn't the same issue.

Are you running this on an amd64/x86_64 based machine or some other device like arm64? uname -m would tell you that.

If you exec into the container, check whether the dbus pid file and socket exist. On my system, where it is running fine, it looks like:

$ docker exec -it timemachine ls -la /var/run/dbus/
total 4
drwxr-xr-x    1 root     root            47 Jun  4 16:10 .
drwxr-xr-x    1 root     root            38 May 31 08:26 ..
-rw-r--r--    1 root     root             3 Jun  4 16:10 dbus.pid
srwxrwxrwx    1 root     root             0 Jun  4 16:10 system_bus_socket

The output of docker images --digests --filter=reference='mbentley/timemachine' could help me as well. Should look something like:

$ docker images --digests --filter=reference='mbentley/timemachine'
REPOSITORY             TAG       DIGEST                                                                    IMAGE ID       CREATED      SIZE
mbentley/timemachine   latest    sha256:9e8764a667ed25731f1ff604604b4e74af53edcb72a7cb823cc201a94c9c5725   fffbb7555837   4 days ago   46.8MB

marvinruder commented 1 year ago

Thanks for the quick response!

I am running the latest image (digest sha256:9e8764a667ed25731f1ff604604b4e74af53edcb72a7cb823cc201a94c9c5725) on x86_64 architecture. The /var/run/dbus/ directory itself exists and is owned by root with 755 permissions, but is empty. I do not see messages like Failed to start message bus: Failed to bind socket, only the one I referenced above.

I also tried with smb-20230508 (one day before #130 apparently) and the issue is present there as well.

marvinruder commented 1 year ago

Had another close look at the configuration in #130 and added

    ulimits:
      nofile:
        soft: 65536
        hard: 65536

to mine, after which the log messages and the high CPU usage disappeared. I didn't find anything related in the image documentation itself; maybe it could be added there?
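For anyone landing here with the same symptoms, this is roughly what the combined service definition looks like. It is only a sketch that merges the compose file from the report above with the ulimits block; the hostname, username, share name, and volume paths are the ones from this setup and will differ elsewhere, and the value 65536 is simply the one that worked here.

```yaml
services:
  timemachine:
    image: mbentley/timemachine:smb
    restart: unless-stopped
    network_mode: host
    # Set the open-file limit for the container explicitly. In this setup,
    # adding this block stopped the "dbus socket not yet available" log loop
    # and the 100% CPU usage from dbus-daemon.
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    environment:
      ADVERTISED_HOSTNAME: "storage1.internal.mruder.dev"
      TM_USERNAME: "core"
      PASSWORD: "********"
      SHARE_NAME: "storage1"
    volumes:
      - /var/mnt/storage:/opt/core
      - /var/cache/timemachine:/var/cache/samba
      - ./timemachine/lib:/var/lib/samba
      - ./timemachine/run:/run/samba
```

The container has to be recreated (not just restarted) for the new limits to apply, for example with docker compose up -d timemachine.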

mbentley commented 1 year ago

No prob. How about this:

$ docker exec -it timemachine sh -c 'ls -la /var/run/*'
/var/run/avahi-daemon:
total 4
drwxr-xr-x    2 avahi    avahi           31 Jun  4 16:10 .
drwxr-xr-x    1 root     root            38 May 31 08:26 ..
-rw-r--r--    1 avahi    avahi            3 Jun  4 16:10 pid
srwxrwxrwx    1 avahi    avahi            0 Jun  4 16:10 socket

/var/run/dbus:
total 4
drwxr-xr-x    1 root     root            47 Jun  4 16:10 .
drwxr-xr-x    1 root     root            38 May 31 08:26 ..
-rw-r--r--    1 root     root             3 Jun  4 16:10 dbus.pid
srwxrwxrwx    1 root     root             0 Jun  4 16:10 system_bus_socket

/var/run/samba:
total 12
drwxr-xr-x    5 root     root           106 Jun  4 16:10 .
drwxr-xr-x    1 root     root            38 May 31 08:26 ..
drwxr-xr-x    3 root     root            31 Nov 28  2022 ncalrpc
drwxr-xr-x    2 root     root            24 Jun  4 16:10 nmbd
-rw-r--r--    1 root     root             3 Jun  4 16:10 nmbd.pid
-rw-r--r--    1 root     root             4 Jun  4 03:00 samba-dcerpcd.pid
-rw-r--r--    1 root     root             3 Jun  4 16:10 smbd.pid
drwxr-xr-x    2 root     root             6 Jan  2  2020 winbindd

mbentley commented 1 year ago

Oh, well, that's interesting. I don't think I've seen an issue that showed when updating the ulimits would be required, or what the symptoms would be. I didn't have the ulimits settings in the docker run command, and I don't see any indication that they were added to the compose file specifically to fix hitting the ulimits, but they were added back in https://github.com/mbentley/docker-timemachine/commit/f3a952d41b7783384da0fe445794046eff01e6ff.

marvinruder commented 1 year ago

Ah, there it is, yes. I only looked at the examples in the README. Perhaps a quick note there that ✨ some issues ✨ may be solved by changing ulimits might be helpful?