Closed hollie closed 3 years ago
I would say that checking /var/log/messages and the output of dmesg could also turn up additional clues as to what is going on.
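As a rough sketch of that check, the helper below counts UAS error-handler events in a kernel log file. The function name count_uas_events is made up for illustration, and the pattern only covers the two handler names that show up in this thread's log excerpt, so treat it as a starting point rather than a complete filter.

```shell
#!/bin/sh
# Hypothetical helper: count UAS error-handler events in a kernel log file.
# The pattern is an assumption based on the handler names seen in this thread.
count_uas_events() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps scripts using "set -e" alive.
    grep -cE 'uas_eh_(abort|device_reset)_handler' "$1" || true
}
```

Calling `count_uas_events /var/log/messages` on the host prints the number of matching lines; `dmesg | grep -E 'uas_eh_'` shows the same kind of entries from the kernel ring buffer.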
Hey Matt,
you have a point. A relevant excerpt from /var/log/messages, from the period after the last successful backup, is:
Apr 1 22:30:10 pi03 kernel: [ 5439.278011] sd 0:0:0:0: [sda] tag#6 uas_eh_abort_handler 0 uas-tag 12 inflight: OUT
Apr 1 22:30:10 pi03 kernel: [ 5439.278033] sd 0:0:0:0: [sda] tag#6 CDB: opcode=0x8a 8a 00 00 00 00 00 c4 34 80 00 00 00 01 88 00 00
Apr 1 22:30:10 pi03 kernel: [ 5439.278405] sd 0:0:0:0: [sda] tag#5 uas_eh_abort_handler 0 uas-tag 11 inflight: OUT
Apr 1 22:30:10 pi03 kernel: [ 5439.278422] sd 0:0:0:0: [sda] tag#5 CDB: opcode=0x8a 8a 00 00 00 00 00 c5 23 ef e0 00 00 00 30 00 00
Apr 1 22:30:12 pi03 kernel: [ 5441.118027] sd 0:0:0:0: [sda] tag#9 uas_eh_abort_handler 0 uas-tag 1 inflight: OUT
Apr 1 22:30:12 pi03 kernel: [ 5441.118047] sd 0:0:0:0: [sda] tag#9 CDB: opcode=0x8a 8a 00 00 00 00 00 c4 34 7d 88 00 00 02 78 00 00
Apr 1 22:30:12 pi03 kernel: [ 5441.158050] scsi host0: uas_eh_device_reset_handler start
I'll report back once my setup has run for a few days without idling the drive, to see whether the issue stays away. Then I can continue testing with hd-idle re-enabled.
Hollie.
Just as an update: after removing the hd-idle script the drive keeps spinning (of course) and the folder no longer becomes inaccessible from within the docker container. So in the end the issue was not related to the timemachine docker image itself. Instead, the mounted drive becomes unavailable to docker as a whole after it spins up and down between active and standby state.
The drive is a LaCie d2 Professional USB 3.1-C 8TB running the latest firmware according to the firmware updater tool (04/2021).
I hope this helps somebody else who is experiencing the same issue.
Share appears empty after running the container for a while
Note: this is an FYI notification of an issue that I am experiencing with my specific setup. The goal is to document what I see and hopefully to be able to resolve the issue. I am quite sure that docker-timemachine is not the cause, but the issue is quite annoying when using docker-timemachine specifically because it causes the backup process to fail. Which is not the idea of course 😄
I am running the latest mbentley/timemachine:smb-armv7l on an RPi4. I mount a USB hard drive with a separate partition for every client under the /media folder on the RPi. Then I have the docker image configured to serve each partition as a separate network share to the different clients that I want to back up. I use this configuration to ensure that the different backup images don't compete with each other for space.
The hard drive is configured to go to sleep after a period of inactivity on the host using hd-idle.
What happens is that after the container has been running for some time, the Time Machine service on a single client starts complaining that it cannot complete the backup process because the 'network backup drive does not support the required service'. (That last sentence is translated, so the wording could be off. The exact phrase is 'de netwerkreservekopieschijf ondersteunt de vereiste voorzieningen niet').
Indeed, when I try to mount the network share from a client machine (either the machine that produced the error, or a different machine) the network share appears to be empty. When I try to create a file on the share it fails.
When I ssh into the server (the host) and check the mounted hard drive folder I can see the expected content. When I enter the docker container using docker-compose exec timemachine /bin/ash and navigate to the network share under /opt, I see the folder is empty, which matches what SMB serves to the client. So it is logical that SMB in the container does not serve the files in the share anymore.
The weird thing is that this only happens for a single share, and that machines that back up to other shares continue to work as expected. So I end up with a timemachine server that allows one client to back up and another one not.
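The host-versus-container comparison described above can be scripted roughly as follows. This is only a sketch: the function names has_content and share_lost are invented for illustration, and the share paths in the usage note are placeholders, not the actual paths from this setup.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the host view of a share with the
# view from inside the container.

# Succeeds when the directory can be listed and holds at least one entry.
has_content() {
    [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Succeeds when the host directory ($1) has content while the listing
# captured from inside the container ($2) is empty, which is the symptom
# described in this issue.
share_lost() {
    has_content "$1" && [ -z "$2" ]
}
```

A possible invocation, assuming a share mounted at /media/client1 on the host and visible at /opt/client1 in the container (both paths are assumptions): `share_lost /media/client1 "$(docker-compose exec -T timemachine ls -A /opt/client1)" && echo "share lost inside container"`. The -T flag disables pseudo-TTY allocation so the output can be captured.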
My current thinking is that this is related to the hard drive going to sleep and docker not being able to recover a functional mapping of the hard drive mount point when the drive needs to wake up from sleep.
I'll now first try to disable the HD sleep to see if this is a first workaround to keep the backup process on the multiple machines working as expected.
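To confirm whether the drive really stays awake once sleep is disabled, one option is to query its power state with hdparm -C. The sketch below only parses that command's output; the function name parse_drive_state is made up, /dev/sda is taken from the kernel log in this thread, and some USB enclosures do not pass the query through, so it may not work on every drive.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `hdparm -C <device>` on stdin and
# print only the reported power state (for example "standby" or "active/idle").
parse_drive_state() {
    # Split on the colon plus any following spaces, keep the value part.
    awk -F': *' '/drive state is/ { print $2 }'
}
```

Usage on the host would look like `sudo hdparm -C /dev/sda | parse_drive_state`. Running it from cron every few minutes and logging the result would show whether the drive ever drops to standby.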
Finally I can add: restarting the docker container results in a working backup process on all clients for some time, and then after some time backup fails for another client with the same behaviour as described above.
So as a recap: my current line of thinking is that hd-idle interferes with docker's access to the USB mount point after the container has been running for a while. Probably the access fails while the drive is spinning up, and other machines that try to back up while the drive is still spinning are able to access the folder contents. If anybody has similar experiences or a solution I am happy to learn about it.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expect the backup service to continue working without issues.
How you're launching your container
docker-compose.yml
smb.conf
Then the various partitions on the hard drive are mounted in /etc/fstab:
Container Logs
Nothing out of the ordinary. The only errors in the log also appear, from time to time, for the other mount points that are working.