Closed israsanc closed 9 months ago
What is the full Docker command to start the time machine? How much disk space is being used on the Docker host where your persistent data is?
I know it works with two macs as I back up two myself but let's narrow down the potential issues.
This is my service definition in the docker-compose yaml:
timemachine:
  container_name: timemachine
  image: mbentley/timemachine:smb-armv7l
  hostname: timemachine
  domainname: {my_domain}
  mac_address: {random_mac_address}
  networks:
    macvlan:
      ipv4_address: {local_ip}
  environment:
    - CUSTOM_SMB_CONF=false
    - CUSTOM_USER=false
    - DEBUG_LEVEL=1
    - MIMIC_MODEL=TimeCapsule8,119
    - EXTERNAL_CONF=
    - HIDE_SHARES=no
    - TM_USERNAME=timemachine
    - TM_GROUPNAME=timemachine
    - TM_UID=1000
    - TM_GID=1000
    - PASSWORD={my_password}
    - SET_PERMISSIONS=false
    - SHARE_NAME=TimeMachine
    - SMB_INHERIT_PERMISSIONS=no
    - SMB_NFS_ACES=yes
    - SMB_METADATA=stream
    - SMB_PORT=445
    - SMB_VFS_OBJECTS=acl_xattr fruit streams_xattr
    - VOLUME_SIZE_LIMIT=1 T
    - WORKGROUP=WORKGROUP
  volumes:
    - ./timemachine-opt-timemachine:/opt/timemachine
    - ./timemachine-var-lib-samba:/var/lib/samba
    - ./timemachine-var-cache-samba:/var/cache/samba
    - ./timemachine-run-samba:/run/samba
  ports:
    - 137:137/udp
    - 138:138/udp
    - 139:139
    - 445:445
  restart: unless-stopped
I'm using the macvlan driver to avoid conflicts with avahi, and my filesystem is btrfs. My current backup uses only 92G according to du.
I think what you're hitting is related to what is being seen, or at least was attempted to be worked around, here: https://gitlab.com/artmg/samba/-/commit/b1714dbf74035550ff30494858e3d879c8d46003
Taking a look at the comment message in the diff:
/*
* Arithmetic on 32-bit systems may cause overflow, depending on
* size_t precision. First we check its unlikely, then we
* force the precision into target off_t, then we check that
* the total did not overflow either.
*/
Multiplying bandsize by nbands from your log output (67108864 × 1459) would be 97911832576 bytes, and that converted to GiB (which is what it is measuring against, not GB) is 91.1875 GiB, which matches what you're seeing on disk via du. I am not much of a programmer and I don't have experience in C, so I am not exactly sure what it is doing, but it just seems to be failing on https://gitlab.com/samba-team/samba/-/blob/b0ba7cd4f96a6ea227943cb05ef51a463e292b2d/source3/modules/vfs_fruit.c#L4995-4999
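The arithmetic can be reproduced with a short sketch. It assumes, as the log suggests, that the total is simply bandsize multiplied by nbands; the variable names mirror the log output and are otherwise illustrative.

```python
# Values taken from the log line quoted in this thread.
bandsize = 67108864  # bytes per band file (64 MiB)
nbands = 1459        # number of band files reported

total_bytes = bandsize * nbands
total_gib = total_bytes / 2**30  # GiB uses powers of 1024, unlike GB

print(total_bytes)  # 97911832576
print(total_gib)    # 91.1875
```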
Based on the output you provided:
bandsize [67108864] nbands [1459]
And then looking at the if statement's math:
bandsize > SIZE_MAX/nbands
The actual math (I believe) should be:
67108864 > 1099511627776 / 1459
67108864 > 753606324
Which should return false, so it should never drop into that branch and output the message you're seeing if it wasn't overflowing as warned.
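One possible explanation, offered as a sketch rather than a confirmed diagnosis: SIZE_MAX is the largest value a size_t can hold, not the configured volume limit. Assuming size_t is 32 bits wide on the armv7l image and 64 bits wide on a 64-bit build, the same comparison gives different answers. The helper name potential_overflow is made up for illustration.

```python
# Assumption: size_t is 32 bits on armv7l and 64 bits on a 64-bit build.
SIZE_MAX_32 = 2**32 - 1
SIZE_MAX_64 = 2**64 - 1

bandsize = 67108864
nbands = 1459

def potential_overflow(bandsize: int, nbands: int, size_max: int) -> bool:
    """Mirror the quoted check: bandsize > SIZE_MAX / nbands (integer division)."""
    return bandsize > size_max // nbands

print(potential_overflow(bandsize, nbands, SIZE_MAX_32))  # True
print(potential_overflow(bandsize, nbands, SIZE_MAX_64))  # False
```

If that assumption holds, the check would fire on a 32-bit build even though the real total is far below the 1 T limit.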
That seems odd to me. Could you get the contents of the smb.conf that is generated inside your container? For example, mine:
# docker exec -it timemachine cat /etc/samba/smb.conf
[global]
access based share enum = no
hide unreadable = no
inherit permissions = no
load printers = no
log file = /var/log/samba/log.%m
logging = file
max log size = 1000
security = user
server min protocol = SMB2
server role = standalone server
smb ports = 445
workgroup = WORKGROUP
vfs objects = acl_xattr fruit streams_xattr
fruit:aapl = yes
fruit:nfs_aces = yes
fruit:model = TimeCapsule8,119
fruit:metadata = stream
fruit:veto_appledouble = no
fruit:posix_rename = yes
fruit:wipe_intentionally_left_blank_rfork = yes
fruit:delete_empty_adfiles = yes
[TimeMachine]
path = /opt/timemachine
inherit permissions = no
read only = no
valid users = timemachine
vfs objects = acl_xattr fruit streams_xattr
fruit:time machine = yes
fruit:time machine max size = 2 T
I want to make sure that it is setting fruit:time machine max size as expected.
Thank you for your help. Using du without the human-readable switch says 95520632.
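As a quick cross-check, and assuming du reported 1 KiB blocks (the usual default, but not confirmed for this host), that figure lines up with the bandsize × nbands total from the log:

```python
du_kib = 95520632            # du output quoted above, assumed to be in KiB
log_total = 67108864 * 1459  # bandsize * nbands from the log, in bytes

du_gib = du_kib * 1024 / 2**30
log_gib = log_total / 2**30

print(round(du_gib, 2))   # 91.1
print(round(log_gib, 2))  # 91.19
```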
It seems you've found a good clue to follow. I'll investigate this myself as well.
My current smb.conf:
[global]
access based share enum = no
hide unreadable = no
inherit permissions = no
load printers = no
log file = /var/log/samba/log.%m
logging = file
max log size = 1000
security = user
server min protocol = SMB2
server role = standalone server
smb ports = 445
workgroup = WORKGROUP
vfs objects = acl_xattr fruit streams_xattr
fruit:aapl = yes
fruit:nfs_aces = yes
fruit:model = TimeCapsule8,119
fruit:metadata = stream
fruit:veto_appledouble = no
fruit:posix_rename = yes
fruit:wipe_intentionally_left_blank_rfork = yes
fruit:delete_empty_adfiles = yes
[TimeMachine]
path = /opt/timemachine
inherit permissions = no
read only = no
valid users = timemachine
vfs objects = acl_xattr fruit streams_xattr
fruit:time machine = yes
fruit:time machine max size = 1 T
Hmm yeah, it seems to be setting it correctly. I recall some strange compose behaviors with values that include spaces, but at first glance I see nothing that could be impacted here. I almost never use compose, just due to how often I find myself fighting syntax issues instead of the actual problem I am solving, so my memory there is a bit fuzzy.
I have the same issue: I get the "tmsize potential overflow" error in the logs.
I'm using the armv7l docker image with VOLUME_SIZE_LIMIT = 500G
I also independently traced the problem to the same issue in the samba repository that @mbentley pointed out. Samba had a fix applied on Mar. 3, 2020, and the change is evidently in the installed version, since the diff shows the error message string changing from tmsize overflow to tmsize potential overflow. The issue persists, however.
From what I can tell from the code in this commit, it exits the function at the return false; so it never reaches the modified line tm_size = (off_t)bandsize * (off_t)nbands;. I am not sure if that is the intent. The change in the output makes it sound like it should report a potential overflow and then maybe do some further check, but I might just be misunderstanding: the original implementation here mentions that it can't check for multiplication overflow by performing the multiplication. I don't know enough about what exactly it is doing and why to understand it and bring it up with someone who does.
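To make that control flow concrete, here is a rough Python model of it. It is a sketch based only on the snippets quoted in this thread, not a copy of the Samba source; the function name tm_size_for and the 32-bit SIZE_MAX default are assumptions.

```python
from typing import Optional

SIZE_MAX_32 = 2**32 - 1  # assumed size_t range on a 32-bit build

def tm_size_for(bandsize: int, nbands: int,
                size_max: int = SIZE_MAX_32) -> Optional[int]:
    """Return bandsize * nbands, or None where the C code would return false."""
    if bandsize > size_max // nbands:
        # The "tmsize potential overflow" message is logged at this point and
        # the function returns early, so the widened multiplication below is
        # never reached.
        return None
    # Stands in for: tm_size = (off_t)bandsize * (off_t)nbands;
    return bandsize * nbands

print(tm_size_for(67108864, 1459))             # None
print(tm_size_for(67108864, 1459, 2**64 - 1))  # 97911832576
```

In this model the early return means the size is never computed at all on a 32-bit build, which would be consistent with the disk_free failure that follows in the log.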
I can confirm I am hitting the same issue running the armv7l docker image with VOLUME_SIZE_LIMIT set to 1 T.
The error in the log is:
fruit_tmsize_do_dirent: tmsize potential overflow: bandsize [67108864] nbands [6372]
sys_disk_free: VFS disk_free failed. Error was : No error information
This error also prevents other clients from connecting via Samba: you can mount the share, but when you start browsing it in Finder it results in a 'network share is temporarily unavailable' error. (This might not be the exact error in English; it is translated from my local language.)
My current workaround is to remove the VOLUME_SIZE_LIMIT parameter from the configuration when starting the docker container. Then all is working as expected.
Looking at another image available, there might be another way to apply a limit: https://github.com/awlx/samba-timemachine/blob/main/entrypoint#L37
I'll have to look into the use of a .com.apple.TimeMachine.quota.plist file as an alternative.
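As a starting point for that investigation, here is a minimal sketch of generating such a file. The key name GlobalQuota, the byte unit, and the target path are assumptions on my part and should be verified against the linked entrypoint; whether macOS honours the file on an SMB share is also untested here.

```python
import plistlib

# Assumption: the quota is stored in bytes under a "GlobalQuota" key.
quota_bytes = 1 * 1024**4  # 1 TiB, matching the "1 T" limit used in this thread

plist_data = plistlib.dumps({"GlobalQuota": quota_bytes})

# Hypothetical target path inside the share; nothing is written in this sketch.
target = "/opt/timemachine/.com.apple.TimeMachine.quota.plist"
print(target)
print(plist_data.decode("utf-8"))
```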
Hi @mbentley,
It seems I am running into the very same issue:
fruit_tmsize_do_dirent: tmsize potential overflow: bandsize [8388608] nbands [2805]
sys_disk_free: VFS disk_free failed. Error was : Argument list too long
The limit is set to 1T too, and it happens during the initial copy (migration) of an existing Time Machine disk. I initialized the sparse bundle by adding a new disk; once the backup started, I cancelled it, mounted the sparse "disk image" and started copying the source over from the HDD (Time Machine).
Plenty of these messages pop up continuously.
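If the check really is bandsize > SIZE_MAX / nbands with a 32-bit size_t (an assumption based on the 32-bit comment quoted earlier in this thread), it would trip once a backup reaches about 4 GiB, regardless of band size. A minimal sketch using the values reported so far; the helper name first_failing_nbands is made up for illustration:

```python
SIZE_MAX_32 = 2**32 - 1  # assumed 32-bit size_t

def first_failing_nbands(bandsize: int, size_max: int = SIZE_MAX_32) -> int:
    """Smallest nbands for which bandsize > size_max // nbands."""
    nbands = 1
    while bandsize <= size_max // nbands:
        nbands += 1
    return nbands

# (bandsize, nbands) pairs taken from the log lines in this thread
reports = [(67108864, 1459), (67108864, 6372), (8388608, 2805)]
for bandsize, nbands in reports:
    limit = first_failing_nbands(bandsize)
    # Prints the threshold and whether the reported nbands is past it.
    print(bandsize, nbands, limit, nbands >= limit)
```

Under that assumption the threshold is 64 bands of 64 MiB or 512 bands of 8 MiB, both exactly 4 GiB, and every report above is well past it.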
Environment - it's aarch64 with alpine:latest as of today (PRETTY_NAME="Alpine Linux v3.14"). HW side is:
model name : Amlogic S922X rev a
Hardware : Hardkernel ODROID-N2
Revision : 0400
Most probably it pulled armv7 and not armv8, as it did for other images; unless I forced it with arm64v8/alpine:latest, it was using armv7. I'm no longer sure how to check this on an existing container.
Would you be able to help me overcome the problem, or explain what the consequences of leaving it like this might be? I wouldn't like to play around with the backup if something is odd on the underlying fs.
Removing the quota doesn't seem to be a good idea here, as the same FS (ext4) is used for other services, hence the quota needs to be enforced at the software level. If not limited, Time Machine will happily eat all the space, won't it?
Could you please provide your docker run command or compose with credentials removed?
Thank you for the great docker image!
Creds are in the user file, but thanks for pointing it out. I use a Dockerfile build, as I tried with arm64v8/alpine:latest, but I need to wait for a relative to kick-start the copy of files again, hence I can't confirm whether it would go through on armv8.
version: "3.7"
#https://github.com/mbentley/docker-timemachine
services:
  timemachine:
    image: local/timemachine:smb
    build: ./docker-timemachine/
    container_name: timecapsule
    hostname: TimeCapsule
    environment:
      - CUSTOM_SMB_CONF=false
      - CUSTOM_USER=false
      - DEBUG_LEVEL=1
      - EXTERNAL_CONF=/users
      - HIDE_SHARES=no
      - MIMIC_MODEL=TimeCapsule8,119
      - TM_USERNAME=timemachine
      - TM_GROUPNAME=timemachine
      - TM_UID=1000
      - TM_GID=1000
      - PASSWORD=timemachine
      - SET_PERMISSIONS=false
      - SHARE_NAME=TimeMachine
      - SMB_INHERIT_PERMISSIONS=no
      - SMB_NFS_ACES=yes
      - SMB_METADATA=stream
      - SMB_PORT=445
      - SMB_VFS_OBJECTS=acl_xattr fruit streams_xattr
      - VOLUME_SIZE_LIMIT=0
      - WORKGROUP=WORKGROUP
    restart: unless-stopped
    volumes:
      - ${APPO}/timecapsule/users:/users:ro
      - ${HDD}/home-timecapsule:/opt/
      - ${APPO}/timecapsule/var-lib-samba:/var/lib/samba
      - ${APPO}/timecapsule/var-cache-samba:/var/cache/samba
      - ${APPO}/timecapsule/run-samba:/run/samba
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    networks:
      mvl_lan:
        ipv4_address: 1.2.3.4
networks:
  mvl_lan:
    external: true
@mbentley - after moving to arm64v8/alpine:latest as the base image, it kind of worked. The copy of the current backup stopped at around 400GB (out of ~960GB), complaining about one of the files. The new "issue" is described here: https://github.com/mbentley/docker-timemachine/issues/105. It might not be an issue at all - just HDD head movement that is simply too slow.
When I try to use the same timemachine volume on a second MacBook Pro it fails and the log shows:
timemachine | fruit_tmsize_do_dirent: tmsize potential overflow: bandsize [67108864] nbands [1459]
timemachine | sys_disk_free: VFS disk_free failed. Error was : No error information
This doesn't occur for the first backup/machine and doesn't occur if I don't set VOLUME_SIZE_LIMIT.