Open tbertels opened 2 years ago
Would using the much simpler script at https://github.com/hunleyd/btrfs-auto-snapshot as a base make it any easier to debug?
I vaguely remember the problem, but it was more with backuphome at the time. That was about two and a half years ago, I still have a test procedure for it. The error comes from the fact that in very rare cases the procedure that is responsible for the progress display returns an error (why this is so I can unfortunately not say). in the backuphome function this was corrected, but unfortunately this seems to be forgotten in the xbiancopy function. With me this has never occurred all the years, neither with backuphome nor with xbiancopy.
Anyway you can try to bring the following change in xbian-storager and test it again
diff --git a/content/usr/local/sbin/xbian-storager b/content/usr/local/sbin/xbian-storager
index beda774..5eedb10 100755
--- a/content/usr/local/sbin/xbian-storager
+++ b/content/usr/local/sbin/xbian-storager
@@ -1403,7 +1403,7 @@ EOF
fi
}
ar=0
- ( copy_volume $opt_cpmode ) 2>&1 | \
+ ( copy_volume $opt_cpmode ) 2>&1 | { \
while read a; do
if [ -n "${a##*[!0-9]*}" ]; then
(( $a < $ar )) && a=100
@@ -1414,7 +1414,8 @@ EOF
fi
ar=$a
fi
- done
+ done || :
+ }
[ "$?" -eq 0 ] || { [ "$BACKGROUND" == 1 ] && printf "%s\n" "$wmsg subvolume $v snapshot $s failed"; exit $E_COPY_DATA; }
if [ "$opt_fstype" = btrfs ]; then
if [ "$opt_cpmode" = tar ]; then
If this works for you without errors, i will apply the patch
No luck with that change. :( What should I remove to try without that progress display procedure? Or is that change supposed to do that? By the way, the error happens when starting the system backup manually through the GUI. I've added a daily automatic system backup to see if that works.
No luck with that change. :(
Strange
What should I remove to try without that progress display procedure? Or is that change supposed to do that?
Something like this:
( copy_volume $opt_cpmode ) >/dev/null 2>&1
[ "$?" -eq 0 ] || { [ "$BACKGROUND" == 1 ] && printf "%s\n" "$wmsg subvolume $v snapshot $s failed"; exit $E_COPY_DATA; }
By the way, the error happens when starting the system backup manually through the GUI. I've added a daily automatic system backup to see if that works.
This should make no difference, as it runs in the background in both cases
So that confirms it: the progress display procedure isn't the cause. Without that change, no progress display is shown but still the same error:
Done processing subvolume modules
Create subvolume '/run/root-dest/root'
Create a readonly snapshot of '/run/xbian-storage/root/@' in '/run/xbian-storage/root/@.ro'
Cloning subvolume root snapshot @ failed
Operation failed with error E_COPY_DATA!
I tried a sudo btrfs scrub start / just in case, but it reported no error.
So I guess the error comes from the function copy_volume. Is there a command that I can try to check that the btrfs send | btrfs receive works as intended (and maybe get a more detailed error)?
Here is my next patch:
diff --git a/content/usr/local/sbin/xbian-storager b/content/usr/local/sbin/xbian-storager
index beda774..d0c9b66 100755
--- a/content/usr/local/sbin/xbian-storager
+++ b/content/usr/local/sbin/xbian-storager
@@ -1396,11 +1396,12 @@ EOF
to='--warning=none --acls --xattrs'
nice -n 10 tar c $to -C ${vs}/$s.ro . | pv -i $ival -ns $siz | eval $opt_remotecmd nice -n 10 tar x $to -C "${vd}/$s"
else
- nice -n 10 btrfs send "${vs}/$s.ro" | pv -i $ival -ns $siz | eval $opt_remotecmd nice -n 10 btrfs receive "${vd}"
+ nice -n 10 btrfs send 2>/tmp/btrfs.send.log "${vs}/$s.ro" | pv -i $ival -ns $siz | eval $opt_remotecmd nice -n 10 btrfs receive 2>/tmp/btrfs.receive.log "${vd}"
fi
else
nice -n 10 zfs send --props $v@ro | pv -i $ival -ns $siz | eval $opt_remotecmd nice -n 10 zfs receive -u "$opt_clonename/${v#*/}"
fi
+ echo "ps=\"${PIPESTATUS[*]}\"" >/run/ps.xbiancopy
}
ar=0
( copy_volume $opt_cpmode ) 2>&1 | \
@@ -1415,7 +1416,14 @@ EOF
ar=$a
fi
done
- [ "$?" -eq 0 ] || { [ "$BACKGROUND" == 1 ] && printf "%s\n" "$wmsg subvolume $v snapshot $s failed"; exit $E_COPY_DATA; }
+ . /run/ps.xbiancopy && rm -f /run/ps.xbiancopy
+ for RC in $ps; do
+ if [ "$RC" -ne 0 ]; then
+ print_log error "error(s) happened while copying data ($ps)"
+ break
+ fi
+ done
+ [ "$RC" -eq 0 ] || { [ "$BACKGROUND" == 1 ] && printf "%s\n" "$wmsg subvolume $v snapshot $s failed (ps=$ps)"; exit $E_COPY_DATA; }
if [ "$opt_fstype" = btrfs ]; then
if [ "$opt_cpmode" = tar ]; then
eval $opt_remotecmd nice -n 10 btrfs fi sync "${vd}/$s" && sync && btrfs sub delete "${vs}/$s.ro" || exit $E_SUB_DELETE
Reverted the previous patch. Now, in case of error the pipe return values are in the error message (ps=X X X)
And, for debugging btrfs send and btrfs receive prints output into file /tmp/btrfs.send.log resp /tmp/btrfs.receive.log
But I can't say if there are reasonable error messages in case of an error. I never had any errors there before
Nice! I think the problem seems obvious now:
xbiancopy.log:
Create subvolume '/run/root-dest/root'
Create a readonly snapshot of '/run/xbian-storage/root/@' in '/run/xbian-storage/root/@.ro'
Cloning root/@, 241138429 of 1148278236, 21% done
Cloning root/@, 528207988 of 1148278236, 46% done
Cloning root/@, 895657024 of 1148278236, 78% done
Cloning root/@, 1136795453 of 1148278236, 99% done
Error: error(s) happened while copying data (141 0 1)
Cloning subvolume root snapshot @ failed (ps=141 0 1)
Operation failed with error E_COPY_DATA!
btrfs.send.log:
At subvol /run/xbian-storage/root/@.ro
btrfs.receive.log:
At subvol @.ro
ERROR: writing to var/cache/apt/srcpkgcache.bin failed: No space left on device
Here's the result of df -h:
Filesystem Size Used Avail Use% Mounted on
udev 1.6G 4.0K 1.6G 1% /dev
tmpfs 360M 1.6M 358M 1% /run
/dev/mmcblk0p2 30G 2.3G 27G 8% /
/dev/mmcblk0p2 30G 2.3G 27G 8% /lib/modules
/dev/mmcblk0p2 30G 2.3G 27G 8% /home
/dev/mmcblk0p2 30G 2.3G 27G 8% /xbmc-backup
/dev/mmcblk0p1 100M 46M 55M 46% /boot
none 4.0K 0 4.0K 0% /sys/fs/cgroup
That is not yet obvious to me Either the image file is created too small or no space available on the NFS share
How big is the image file? Should have at least 2.3GB (minus used space by snapshots)
Here's what df- h returns just before the error:
Filesystem Size Used Avail Use% Mounted on
udev 1.6G 4.0K 1.6G 1% /dev
tmpfs 360M 1.6M 358M 1% /run
/dev/mmcblk0p2 30G 2.3G 27G 8% /
/dev/mmcblk0p2 30G 2.3G 27G 8% /lib/modules
/dev/mmcblk0p2 30G 2.3G 27G 8% /home
/dev/mmcblk0p2 30G 2.3G 27G 8% /xbmc-backup
/dev/mmcblk0p1 100M 46M 55M 46% /boot
none 4.0K 0 4.0K 0% /sys/fs/cgroup
/dev/mmcblk0p2 30G 2.3G 27G 8% /run/xbian-storage
192.168.1.5:/volume1/3 Tera Backup/Backup/Raspberry-backups/Bureau 2.7T 2.5T 251G 91% /run/kodimap
/dev/mapper/loop0p2 1.4G 1.3G 2.6M 100% /run/root-dest
And the usage of /dev/mapper/loop0p2 increases until the error.
Here's the full xbiancopy.log
As I understand it, /dev/mapper/loop0p2 is the snapshot, so it's normal that it gets full. I still wonder why I get "No space left on device" type errors in btrfs.receive.log:
ERROR: writing to usr/include/linux/pkt_sched.h failed: No space left on device
or
ERROR: writing to usr/include/linux/sysctl.h failed: No space left on device
Yes, /dev/mapper/loop0p2 is the container for the image. But as you can see it is practically full. So btrfs is probably right with its error message. The question now is, why is the container so tight for you. For me this is never the case
btrfs receive writes this error message because this command writes into the container. That's correct
Since it seems to need just a little more space, I tried enabling zlib compression, and it worked.
Here's the df -h just before the end:
Filesystem Size Used Avail Use% Mounted on
/dev/mmcblk0p2 30G 2.3G 27G 8% /
devtmpfs 1.6G 4.0K 1.6G 1% /dev
none 360M 1.5M 358M 1% /run
/dev/mmcblk0p2 30G 2.3G 27G 8% /lib/modules
/dev/mmcblk0p2 30G 2.3G 27G 8% /home
/dev/mmcblk0p1 100M 46M 55M 46% /boot
none 4.0K 0 4.0K 0% /sys/fs/cgroup
/dev/mmcblk0p2 30G 2.3G 27G 8% /run/xbian-storage
192.168.1.5:/volume1/3 Tera Backup/Backup/Raspberry-backups/Bureau 2.7T 2.5T 251G 91% /run/kodimap
/dev/mapper/loop0p2 1.4G 814M 500M 62% /run/root-dest
And the xbiancopy.log The btrfs.send.log and btrfs.receive.log contain no error.
The only question is how much, I checked today at noon, currently there is 20% more space than actually needed
You could add a startvalue for variable size in bytes as shown below
size=<startvalueinbytes>
for v in $FS_LOCAL_LIST; do
[ -n "$v" ] || continue
for s in $snapextra; do
sv=$(btrfs-auto-snapshot volsize "$v/$s")
size=$((size+sv))
done
done
this is how it looks with a backup on one of my Pi's
/dev/mapper/loop0p2 5921792 5269068 536660 91% /tmp/m
With size=51200000, I get
/dev/mapper/loop0p2 1515520 1378832 10992 100% /run/root-dest
and it works.
I have now made a kind of dynamic adjustment of the correction size. works here so far quite well
diff --git a/content/usr/local/sbin/xbian-storager b/content/usr/local/sbin/xbian-storager
index beda774..b2a5b13 100755
--- a/content/usr/local/sbin/xbian-storager
+++ b/content/usr/local/sbin/xbian-storager
@@ -45,6 +45,7 @@ opt_name=''
opt_exclude="XXXXXXXX"
opt_quiet=''
opt_size=''
+opt_sizeadj='300M'
# Default dynamcally set program options/variables for BTRFS/ZFS
@@ -1177,15 +1178,17 @@ EOF
exit $E_SAME_FS
fi
+ size=$(numfmt --from=iec $opt_sizeadj)
+ size_fix=105
if [ -z "$opt_size" ]; then
for v in $FS_LOCAL_LIST; do
[ -n "$v" ] || continue
for s in $snapextra; do
- sv=$(btrfs-auto-snapshot volsize "$v/$s")
+ sv=$(xbian-storager volsize "$v/$s")
size=$((size+sv))
done
done
- opt_size="$(($size*12/10/1024/1024))M"
+ opt_size="$(($size*$size_fix/100/1024/1024))M"
fi
size="$(numfmt --from=iec $opt_size)"
@@ -1365,6 +1368,33 @@ EOF
fi
fi
+ optimize_size () {
+ if [ "$opt_fstype" = btrfs ]; then
+ size=$(df -h | grep -m1 "$root_dest" | awk '{print $4}')
+ else
+ size=$(zfs list -Ho space $opt_clonename | awk '{print $2}')
+ fi
+ [ "$BACKGROUND" == 1 ] && printf "Space remaining in %s: %s (%s)\n" "$(basename "$fn")" $size $opt_sizeadj
+
+ size_avail=$(numfmt --from=iec $size)
+
+ opt_sizeadj=$(numfmt --from=iec $opt_sizeadj)
+ if [ "$size_avail" -gt "$(numfmt --from=iec 150M)" ]; then
+ opt_sizeadj=$((opt_sizeadj-50*1024*1024))
+ elif [ "$size_avail" -lt "$(numfmt --from=iec 50M)" ]; then
+ opt_sizeadj=$((opt_sizeadj+50*1024*1024))
+ else
+ opt_sizeadj=''
+ fi
+ if [ -n "$opt_sizeadj" ] && [ -e /etc/default/xbian-internals ]; then
+ if grep -q "OPT_SIZEADJ=" /etc/default/xbian-internals; then
+ sed -i "s/OPT_SIZEADJ=.*/OPT_SIZEADJ=$(numfmt --to=iec $opt_sizeadj)/g" /etc/default/xbian-internals
+ else
+ echo "OPT_SIZEADJ=$(numfmt --to=iec $opt_sizeadj)" >> /etc/default/xbian-internals
+ fi
+ fi
+ }
+
set -o pipefail
for v in $FS_LOCAL_LIST; do
if [ -z "$v" ] || echo "$v" | grep -qE "\.delete$|\.zlib$|\.lzo$|\.lz4$|\.zstd$"; then
@@ -1396,11 +1426,12 @@ EOF
to='--warning=none --acls --xattrs'
nice -n 10 tar c $to -C ${vs}/$s.ro . | pv -i $ival -ns $siz | eval $opt_remotecmd nice -n 10 tar x $to -C "${vd}/$s"
else
- nice -n 10 btrfs send "${vs}/$s.ro" | pv -i $ival -ns $siz | eval $opt_remotecmd nice -n 10 btrfs receive "${vd}"
+ nice -n 10 btrfs send 2>/tmp/btrfs.send.log "${vs}/$s.ro" | pv -i $ival -ns $siz | eval $opt_remotecmd nice -n 10 btrfs receive 2>/tmp/btrfs.receive.log "${vd}"
fi
else
nice -n 10 zfs send --props $v@ro | pv -i $ival -ns $siz | eval $opt_remotecmd nice -n 10 zfs receive -u "$opt_clonename/${v#*/}"
fi
+ echo "ps=\"${PIPESTATUS[*]}\"" >/run/ps.xbiancopy
}
ar=0
( copy_volume $opt_cpmode ) 2>&1 | \
@@ -1415,7 +1446,13 @@ EOF
ar=$a
fi
done
- [ "$?" -eq 0 ] || { [ "$BACKGROUND" == 1 ] && printf "%s\n" "$wmsg subvolume $v snapshot $s failed"; exit $E_COPY_DATA; }
+ . /run/ps.xbiancopy && rm -f /run/ps.xbiancopy
+ for RC in $ps; do
+ if [ "$RC" -ne 0 ]; then
+ optimize_size
+ [ "$BACKGROUND" == 1 ] && printf "%s\n" "$wmsg subvolume $v snapshot $s failed (ps=$ps)"; exit $E_COPY_DATA
+ fi
+ done
if [ "$opt_fstype" = btrfs ]; then
if [ "$opt_cpmode" = tar ]; then
eval $opt_remotecmd nice -n 10 btrfs fi sync "${vd}/$s" && sync && btrfs sub delete "${vs}/$s.ro" || exit $E_SUB_DELETE
@@ -1435,12 +1472,13 @@ EOF
[ "$BACKGROUND" == 1 ] && printf "%s\n" "Done processing subvolume $v"
done
+ optimize_size
+
if [ "$opt_fstype" = zfs ]; then
vdr="$(zpool get -H bootfs -o value "$opt_rpool" | sed "s%$opt_rpool%$opt_clonename%g")"
eval $opt_remotecmd zpool set bootfs=$vdr $opt_clonename
echo "vdr=$vdr" >>/run/vars.xbiancopy.$$
fi
-
if [ -n "$opt_recompress" ]; then
for v in $FS_LOCAL_LIST; do
if [ -z "$v" ] || echo "$v" | grep -qE "\.delete$|\.zlib$|\.lzo$|\.lz4$|\.zstd$"; then
@@ -1813,7 +1851,8 @@ esac
[ -e /etc/default/xbian-internals ] && . /etc/default/xbian-internals
[ -z "$OPT_COPY_MODE" ] || opt_cpmode="$OPT_COPY_MODE"
[ -z "$OPT_COMPRESS" ] || opt_compress="$OPT_COMPRESS"
-[ -z "$OPT_REMOTECMD" ] || opt_remotecmd="OPT_REMOTECMD"
+[ -z "$OPT_REMOTECMD" ] || opt_remotecmd="$OPT_REMOTECMD"
+[ -z "$OPT_SIZEADJ" ] || opt_sizeadj="$OPT_SIZEADJ"
if [ "$1" != '//' -a "${1%%-*}" != '' ]; then
opt_command="$1"
@@ -2165,7 +2204,7 @@ elif [ "$opt_command" = volsize ]; then
do_run mount -t btrfs $opt_dev $md 2>/dev/null
[ ! -d $md/$1 ] || printf "$(du -xsb $md/$1 | awk '{print $1}')\n"
else
- do_run zfs list -Hpo used ${1%%/@} 2>/dev/null
+ do_run zfs list -Hpo space ${1%%/@} 2>/dev/null | awk '{printf "%d\n", $5}'
fi
exit 0
elif [ "$opt_command" = fstype ]; then
Since it is nearly impossible to determine the exact size of the required image due to metadata and other imponderables, a dynamic adjustment is used. The value is set for each run so that the free space in the image is around 100M
I'm having a hard time merging those changes on Windows (I did it manually for the other ones). Could you attach the modified file?
This should be much easier:
1) download the patch 2) ssh into xbian 3) cd /usr/local/sbin 4) sudo patch -lp5 < /pathtopatch/downloadedpatch
Oh I forgot, of course restore the original state with sudo apt-get install --reinstall xbian-package-config-shell
first
Thanks, the patching works fine with that command, even on Windows (at least with the patch tool from Git, not the one from GnuWin32).
It works here too, with 77M (300M) remaining.
Great, so it seems that 300M start value will be a good choice.
It works fine too on an other rpi4, with 62M remaining there.
After a longer test and some corrections, this should now be fixed with this commit: https://github.com/xbianonpi/xbian-package-config-shell/commit/97e54ab916f73880b15042d9a084cd4b7fad3bf4
An adaptive size change, it should be future proof!
The btrfs-auto-snapshot.sh script (https://github.com/xbianonpi/xbian-package-config-shell/blob/master/content/usr/local/sbin/xbian-storager) stops with error E_COPY_DATA
Version: 5.10.82+ #1 SMP PREEMPT Wed Dec 1 19:48:08 CET 2021 armv7l XBian 20210911-0 - Bleeding Edge
This seems to be a known problem: http://forum.xbian.org/thread-4106-post-36879.html
Home backup works fine on the same NFS network share.
Log excerpt:
I can see the backup file being created on the network share before the error.
Would any other log help?