mbentley / docker-timemachine

Docker image to run Samba (compatible Time Machine for macOS)
Apache License 2.0
588 stars 67 forks source link

Time machine gets connected but waits on `Locating backup drive...` and eventually errors out #116

Closed vedhavyas closed 11 months ago

vedhavyas commented 2 years ago

Describe the bug

To Reproduce Steps to reproduce the behavior: Running with the following env

Expected behavior Once connected, timemachine starts backing up

How you're launching your container

timemachine:
    image: mbentley/timemachine:smb
    container_name: timemachine
    environment:
      - TM_USERNAME=${SMB_USER}
      - PASSWORD=${SMB_PASS}
      - SET_PERMISSIONS=true
      - TM_UID=${PUID}
      - TM_GID=${PGID}
      - SMB_INHERIT_PERMISSIONS=yes
      - DEBUG_LEVEL=3
      - WORKGROUP=Hub
    restart: unless-stopped
    networks:
      docker-direct:
        ipv4_address: 10.10.2.253
    volumes:
      - ${HUB_DIR}/time-machine:/opt/timemachine
    ulimits:
      nofile:
        soft: 65536
        hard: 65536

Container Logs

Registered MSG_REQ_POOL_USAGE
Allowed connection from 10.10.1.3 (10.10.1.3)
init_oplocks: initializing messages.
Transaction 0 of length 73 (0 toread)
switch message SMBnegprot (pid 66) conn 0x0
Requested protocol [NT LM 0.12]
Requested protocol [SMB 2.002]
Requested protocol [SMB 2.???]
Selected protocol SMB2_FF
GENSEC backend 'gssapi_spnego' registered
GENSEC backend 'gssapi_krb5' registered
GENSEC backend 'gssapi_krb5_sasl' registered
GENSEC backend 'spnego' registered
GENSEC backend 'schannel' registered
GENSEC backend 'naclrpc_as_system' registered
GENSEC backend 'sasl-EXTERNAL' registered
GENSEC backend 'ntlmssp' registered
GENSEC backend 'ntlmssp_resume_ccache' registered
GENSEC backend 'http_basic' registered
GENSEC backend 'http_ntlm' registered
GENSEC backend 'http_negotiate' registered
Selected protocol SMB 2.???
smbd_do_qfsinfo: level = 1003
Selected protocol SMB3_11
Allowed connection from 10.10.1.3 (10.10.1.3)
Selected protocol SMB3_11
Server exit (passed connection)
Registered MSG_REQ_POOL_USAGE
Allowed connection from 10.10.1.3 (10.10.1.3)
init_oplocks: initializing messages.
Transaction 0 of length 73 (0 toread)
switch message SMBnegprot (pid 67) conn 0x0
Requested protocol [NT LM 0.12]
Requested protocol [SMB 2.002]
Requested protocol [SMB 2.???]
Selected protocol SMB2_FF
GENSEC backend 'gssapi_spnego' registered
GENSEC backend 'gssapi_krb5' registered
GENSEC backend 'gssapi_krb5_sasl' registered
GENSEC backend 'spnego' registered
GENSEC backend 'schannel' registered
GENSEC backend 'naclrpc_as_system' registered
GENSEC backend 'sasl-EXTERNAL' registered
GENSEC backend 'ntlmssp' registered
GENSEC backend 'ntlmssp_resume_ccache' registered
GENSEC backend 'http_basic' registered
GENSEC backend 'http_ntlm' registered
GENSEC backend 'http_negotiate' registered
Selected protocol SMB 2.???
Selected protocol SMB3_11
Allowed connection from 10.10.1.3 (10.10.1.3)
Selected protocol SMB3_11
Server exit (passed connection)
smbd_do_qfsinfo: level = 1003
unix_mode(.) inherit mode 40770
unix_mode(.:AFP_AfpInfo) inherit mode 40770
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_create.c:337
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[5] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_server.c:3323
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[9] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_server.c:3323
smbd_do_qfsinfo: level = 1003
unix_mode(.) inherit mode 40770
unix_mode(.:AFP_AfpInfo) inherit mode 40770
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_create.c:337
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[5] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_server.c:3323
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[9] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_server.c:3323
send_local_master_announcement: type 849a03 for name 0E8441200715 on subnet 10.10.2.253 for workgroup HUB
send_workgroup_announcement: on subnet 10.10.2.253 for workgroup HUB
smbd_do_qfsinfo: level = 1003
unix_mode(.) inherit mode 40770
unix_mode(.:AFP_AfpInfo) inherit mode 40770
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_create.c:337
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[5] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_server.c:3323
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[9] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_server.c:3323

Additional context

mbentley commented 2 years ago

Hmm, not sure what would be going on. What filesystem are you using for your persistent data where ${HUB_DIR}/time-machine exists? Also what mount options are used for the filesystem (output of mount for the specific fs would work).

vedhavyas commented 2 years ago

Okay, so the HUB_DIR is rclone mount over sftp. That sftp is on ZFS filesystem.

I have also tried it on local disk thinking it would be mount issue but I get the same issue

mbentley commented 2 years ago

What was the local filesystem and mount options when testing that? Are you able to connect to the share manually (e.g. - using smb://10.10.2.253 in finder and able to drop a file on the share)?

For your ZFS filesystem that is backing the share, what options were used when the fs was created? For example, I believe there are issues if ACLs aren't enabled or supported (if you could post the properties, that might be helpful just for comparison https://github.com/mbentley/docker-timemachine/issues/101#issuecomment-940153119). Here are mine on ZFS for comparison.

Another place you might look is check out Console.app to see if there are any client errors while attempting to connect to the time machine backup. I know that's gonna be a pain as it is going to scroll a mile a minute but it could provide something useful.

vedhavyas commented 2 years ago

What was the local filesystem and mount options when testing that? Are you able to connect to the share manually (e.g. - using smb://10.10.2.253 in finder and able to drop a file on the share)?

yeap i was able to create a file and found a mistake as I was mounting wrong directory Fixed with Volumemount: ${HUB_DIR}/time-machine:/opt/${SMB_USER} Now I cannot create any file once smb folder after connection. Looks like permission issue

vedhavyas commented 2 years ago

For your ZFS filesystem that is backing the share, what options were used when the fs was created? For example, I believe there are issues if ACLs aren't enabled or supported (if you could post the properties, that might be helpful just for comparison https://github.com/mbentley/docker-timemachine/issues/101#issuecomment-940153119). Here are mine on ZFS for comparison.

I dont have have acces to those I think. I'm using rsyc.net if that give any more info

jjmulenex commented 1 year ago

Were you ever able to fix this? I am having the same issue after updating to the latest image

For context:

smbd_do_qfsinfo: level = 1003
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_create.c:340
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[5] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_server.c:3331
smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[9] status[NT_STATUS_OBJECT_NAME_NOT_FOUND] || at ../../source3/smbd/smb2_server.c:3331
vedhavyas commented 1 year ago

I have not found a fix and did not proceed forward. I can try with latest image this week and report back if I still has the issue

mbentley commented 1 year ago

One thing that I might be able to do is try to adapt the current image from Alpine to Debian to see if switching helps seeing as at least the Debian packages will remain stable for a longer period of time. Maybe it something related to a version of an Alpine package as they are certainly more of a moving target than Debian packages are. It's just difficult as it is not easy to reproduce the issues.

mbentley commented 1 year ago

OK, I've created a multi-arch tag of images for amd64, arm64, and armv7l that can be tested to see if this works more reliably for anyone: https://hub.docker.com/r/mbentley/timemachine/tags?page=1&name=smb-debian

You should just be able to test with all of the same run commands or compose files, just swapping in mbentley/timemachine:smb-debian. I've only tested a backup from a brand new macOS Sonoma VM, which worked fine for me so if anyone could test, that would be great.

Just in case anyone wants to see the temporary changes to switch to Debian: https://github.com/mbentley/docker-timemachine/compare/master...debian-test. Still a lot more cleanup and validating that everything works as expected that needs to be done but it worked with a single user.

screenshot 2023-11-01 at 9 03 08 AM

jjmulenex commented 1 year ago

Thanks: I recreated the image with the new tag and was still not able to connect from Timemachine. It prepares the backup but failed in the end. I have included a truncated log maybe it will help. I can connect to the share from finder

https://pastebin.com/A4UEEjvf

mbentley commented 1 year ago

I went through the log files to see if there was anything that stuck out but I didn't see anything that sticks out as a smoking gun. I have no idea if it is going to be helpful but I found a post mentioning that you could get the time machine client logs using something like this from the terminal:

printf '\e[3J' && log show --predicate 'subsystem == "com.apple.TimeMachine"' --info --last 6h | grep -F 'eMac' | grep -Fv 'etat' | awk -F']' '{print substr($0,1,19), $NF}'
vedhavyas commented 1 year ago

I have tried with latest image again today and these are logs. I did fine while creating backup and failed after https://pastebin.com/nXL5KeCe

jjmulenex commented 1 year ago

I went through the log files to see if there was anything that stuck out but I didn't see anything that sticks out as a smoking gun. I have no idea if it is going to be helpful but I found a post mentioning that you could get the time machine client logs using something like this from the terminal:

printf '\e[3J' && log show --predicate 'subsystem == "com.apple.TimeMachine"' --info --last 6h | grep -F 'eMac' | grep -Fv 'etat' | awk -F']' '{print substr($0,1,19), $NF}'

Thank you @mbentley this was helpful. I was able to see there was a error with the local mount of my time machine backup volume

3-11-01 20:26:58  Found disk3s1 41504653-0000-11AA-AA11-00306543ECAC
2023-11-01 20:26:58  Found disk3s1 41504653-0000-11AA-AA11-00306543ECAC
2023-11-01 20:26:58  Found disk3s1 41504653-0000-11AA-AA11-00306543ECAC
2023-11-01 20:26:58  Found disk3s1 41504653-0000-11AA-AA11-00306543ECAC
2023-11-01 20:26:58  Found disk3s1 41504653-0000-11AA-AA11-00306543ECAC
2023-11-01 20:26:58  Attempting to mount APFS volume from disk3s1
2023-11-01 20:27:00  Found disk3s1 41504653-0000-11AA-AA11-00306543ECAC
2023-11-01 20:27:00  Mounted disk3s1 at '/Volumes/Backups of Kaylee (2)
2023-11-01 20:27:04  Invalid mountpoint '/Volumes/Backups of Kaylee (2)' - failed to open '/Volumes/Backups of Kaylee (2)/.iotest', error: 92 Illegal byte sequence
2023-11-01 20:27:04  Invalid mountpoint '/Volumes/Backups of Kaylee (2)' - failed to open '/Volumes/Backups of Kaylee (2)/.iotest', error: 92 Illegal byte sequence
2023-11-01 20:27:04  '/Volumes/.timemachine/timemachine._smb._tcp.local./405832EE-021A-4530-96B6-E85A9AC3DE54/TimeMachine/Kaylee (2).sparsebundle' mounted at '/Volumes/Backups of Kaylee (2)'
2023-11-01 20:27:04  Invalid mountpoint '/Volumes/Backups of Kaylee (2)' - failed to open '/Volumes/Backups of Kaylee (2)/.iotest', error: 92 Illegal byte sequence
2023-11-01 20:27:04  Backup failed (19: BACKUP_FAILED_TARGETVOL_NOT_MOUNTED - The backup disk could not be resolved, or there was a problem mounting it.)
2023-11-01 20:27:05  Unmounted '/Volumes/Backups of Kaylee (2)'
2023-11-01 20:27:07  Unmounted '/Volumes/.timemachine/timemachine._smb._tcp.local./405832EE-021A-4530-96B6-E85A9AC3DE54/TimeMachine'

I watched as time machine mounted the sparsebundle, scanned it, and gave the above error.

Next I recreated the container using your Debian version, Same result. I then moved the sparsebundle to a directory called old and re-ran the backup with no issues. I manually backed up the machine with the default smb version (new container) same sparcebundle with no issues.

Conclusion: there was a problem with the data somewhere and the backup was corrupted for me. I am using a drive that is formatted ext4 on a 2 disk RAID0

@mbentley Thanks for trying to help

mbentley commented 11 months ago

I have tried with latest image again today and these are logs. I did fine while creating backup and failed after https://pastebin.com/nXL5KeCe

I am not sure if NT_STATUS_NOT_SUPPORTED is just a warning or an error in context of time machine. Are you still having issues getting this working? If so, let me know and I will see what I can do.