Closed henfri closed 6 years ago
Read the FaQ about this?
On December 28, 2017 11:22:48 AM GMT+01:00, henfri notifications@github.com wrote:
I was wondering, why my backup took so long -although I ran a complete backup yesterday. Thus, I have increased the verbosity, and I see many (all?) files with the "A" flag. I did some reading and quickly found this:
So, I checked one of the files mtime with this little script:
filemtime=`stat -c %Y $1` currtime=`date +%s` diff=$(( (currtime - filemtime) / 86400 )) echo $diffroot
bash mtime.sh /srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf 4371
So, this file is quite old. Nevertheless, I created a new file
touch /srv/DataPool/Dokumente/test.txt
TL;DR Skip to the bottom (marked again by TL;DR)
Still: Many many old files are getting the "A" flag.
The command I run is this:
borg create borg@openmediavault:/srv/_dev_disk_by-id_ata-ST5000DM000-1FK178_W4J1K7R1/borgbackup1/borgbackup::{hostname}-documents-{now} /srv/DataPool/Dokumente /srv/DataPool/Fotos /srv/DataPool/HomeVideo /srv/DataPool/Musik /srv/DataPool/Paperless /root /home /etc --exclude-from /etc/borgmatic/excludes --exclude-from /tmp/tmpgiptutgu --exclude-caches --exclude-if-present .nobackup --compression lz4 --one-file-system --remote-path borg --umask 77 --debug --list --stats
(line-breaks added for better readability)
borg lists gives me:
borg list borg@openmediavault:/srv/_dev_disk_by-id_ata-ST5000DM000-1FK178_W4J1K7R1/borgbackup1/borgbackup | grep documents homeserver-documents-2017-10-22T18:00:30 Sun, 2017-10-22 18:00:34 [2145c2bb752c426204eacf858ed7f4183c89de878b5781a91544f8e167b0538e] homeserver-documents-2017-10-27T20:35:37 Fri, 2017-10-27 20:35:41 [66ca5582e7284d42f20636b463148ca2ef890580b609f62dbcd2fcfb9ff3c0b2] homeserver-documents-2017-10-28T22:55:45 Sat, 2017-10-28 22:55:49 [7c7f66f60e77bf45acc02f08a9666a74f3d9f561cdc5bb90a10084d9a1678bf4] homeserver-documents-2017-12-27T12:38:44 Wed, 2017-12-27 12:38:53 [f71a36983813c9fd0c866f3e5fc175a45887315ce77b1eba09a174c66f3048dd] homeserver-documents-2017-12-27T13:02:24 Wed, 2017-12-27 13:02:33 [1cf7803186b9d851690770aab96abea404871b473841024bff404dc80035c65b]
Next, I do:
borg mount borg@openmediavault:/srv/_dev_disk_by-id_ata-ST5000DM000-1FK178_W4J1K7R1/borgbackup1/borgbackup /mnt/test cd /mnt/test/homeserver-documents-2017-12-27T13:02:24
and a ls -l of both files:
-rwxrwxrwx 1 henfri users 65616 Jan 8 2006 /srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf -rwxrwxrwx 1 henfri users 65616 Jan 8 2006 ./srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
and the mtime of both files:
root@homeserver:/mnt/test/homeserver-documents-2017-12-27T13:02:24# bash ~/mtime.sh /srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf 4371 root@homeserver:/mnt/test/homeserver-documents-2017-12-27T13:02:24# bash ~/mtime.sh ./srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf 4371
Also the md5sum of both files is identical:
40b7f91463627b9fa04bfe453ad5b69a /srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf 40b7f91463627b9fa04bfe453ad5b69a ./srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
What am I missing?
I did now check another folder:
borg create borg@openmediavault:/srv/_dev_disk_by-id_ata-ST5000DM000-1FK178_W4J1K7R1/borgbackup1/borgbackup::{hostname}-etc /etc --exclude-from /etc/borgmatic/excludes --exclude-from /tmp/tmpgiptutgu --exclude-caches --exclude-if-present .nobackup --compression lz4 --one-file-system --remote-path borg --umask 77 --debug --list --stats
This backs up only /etc (which is also part of a previous backup). Here, many files are correctly identified as 'U'.Why does this work properly on /etc/ but not on /srv/DataPool/Dokumente/? Well, there is one speciality... DataPool is a mergerfs Pool.
So, I tested the example file in isolation:
borg create borg@openmediavault:/srv/_dev_disk_by-id_ata-ST5000DM000-1FK178_W4J1K7R1/borgbackup1/borgbackup::{hostname}-test1 /srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf --exclude-from /etc/borgmatic/excludes --exclude-caches --exclude-if-present .nobackup --compression lz4 --one-file-system --remote-path borg --umask 77 --debug --list --stats
This time, I got a lot of these message -not sure why:
Fetching and building archive index for 'homeserver-2017-11-05-190701' ... Merging into master chunks index ... Reading cached archive chunk index for 'homeserver-2017-07-05-190702' ... Verified integrity of /root/.cache/borg/f9ddbe8a3610937e192017f636270c8c95254bcd767c9959c18cbfc9c85a6d0a/chunks.archive.d/b5e06201e90d2051a49da70a0e8bc9fa0624df0feae6b90073ec49e3f757f676.compact Merging into master chunks index ...
But nevertheless, again I got:
A /srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
So next I try with the underlying filesystem (looking 'below' the mergerfs): (only the path to the file is changed: /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf)TL;DR just read from here: Summary: when backing up /etc/ files are correctly identified as unchanged. When backing up /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf (or other files on this file system) the files are recognized as 'A')
So now I try:
borg create borg@openmediavault:/srv/_dev_disk_by-id_ata-ST5000DM000-1FK178_W4J1K7R1/borgbackup1/borgbackup::{hostname}-test2 /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf --exclude-from /etc/borgmatic/excludes --exclude-caches --exclude-if-present .nobackup --compression lz4 --one-file-system --remote-path borg --umask 77 --debug --list --stats
No change: A /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
If I repeat the last command (just incrementing the archive name to test3) I still get "A".
So, I suspect it has something to do with the file system: Here the output of the mount command:
/dev/sdd on /media/678c601d-8d2e-4be0-9226-d20a6f772f8b type ext4 (rw,noexec,relatime,data=ordered,jqfmt=vfsv0,usrjquota=aquota.user,grpjquota=aquota.group)
What can be the reason?
Greetings, Hendrik
-- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/borgbackup/borg/issues/3492
-- Sent from my Android device with K-9 Mail. Please excuse my brevity.
@ThomasWaldmann Did you read the title of this issue? :-)
To answer your question: yes, I did! And the mtime of the file I use for the test is several years old.
Ehrm, no, obviously I only read the (github email) body.
Please specify the borg version. For mergerfs mount, use use_ino option.
Also, always use the same path for your mount. borg identifies files in the files cache by the hash of their full path. So if you change the mountpoint, the cached information won't be found.
Hello Thomas,
thanks for your help. The Borg-Version is 1.1.2. I have not changed the path. I have not used the use_ino option. I did add that now (fstab) and re-booted. The option is not shown as output of the mount command though (?)
Then I tried my test:
borg create borg@openmediavault:/srv/_dev_disk_by-id_ata-ST5000DM000-1FK178_W4J1K7R1/borgbackup1/borgbackup::{hostname}-test2 /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf --exclude-from /etc/borgmatic/excludes --exclude-caches --exclude-if-present .nobackup --compression lz4 --one-file-system --remote-path borg --umask 77 --debug --list --stats
Still and again I get
A /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
Do you have a further hint?
Greetings, Hendrik
You need to run 2 backups and look at the output of the second after adding the use_ino.
Hello Thomas,
that's what I did. To make sure, I did try that again. Still I get the "A" for the file that I have added now multiple times.
What's still odd though:
I have not changed the path. I have not used the use_ino option. I did add that now (fstab) and re-booted. The option is not shown as output of the mount command though (?)
Shouldn't mount list that this option was actually used?
Greetings, Hendrik
if the files-cache is expecting stable inodes (this is the default, see --files-cache) and your fs does not have stable inodes, "A" status all the time is expected.
So, why didn't you use use_ino? Or did you use it and is it not working? check with ls -i.
Hello Thomas,
I understand that.
I did not use use_ino in the past, because I was not aware. Now I do use it, but it is not displayed in the options according to https://github.com/trapexit/mergerfs/issues/460
Now, to ensure that it is no issue with mergerfs, I can try to backup the file from the underlying, actual (ext4) filesystem (in fact, I did that). Still the same issue.
ls -i shows in this case:
root@homeserver:/srv/DataPool# ls -i /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
30415538 /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
root@homeserver:/srv/DataPool# ls -i /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
30415538 /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
and in the mergerfs case:
root@homeserver:/srv/DataPool# ls -i /media/22cd0c22-c75b-432a-b92b-73229607a8bd/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
9002281867954 /media/22cd0c22-c75b-432a-b92b-73229607a8bd/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
root@homeserver:/srv/DataPool# ls -i /media/22cd0c22-c75b-432a-b92b-73229607a8bd/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
9002281867954 /media/22cd0c22-c75b-432a-b92b-73229607a8bd/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
So, to me it looks like the inodes are stable, no?
Greetings, Hendrik
I don't think you'll see instable inodes that way, but rather when there was some fs activitity in between AND a umount / mount.
Ok.
Is this an appropriate test?
# ls -i /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
30415538 /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
# touch /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/test
# umount -l /media/678c601d-8d2e-4be0-9226-d20a6f772f8b
# mount /media/678c601d-8d2e-4be0-9226-d20a6f772f8b
# ls -i /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
30415538 /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
P.S: can we determine the inode that borg saved for this file?
files_cache[H(fullpath)] -> file_entry (contains ctime, inode, size, chunkids)
Hm, I hoped for something I can do. C or C++ is not my world. Python would be. Do you have an alternative to progress on this?
Thanks for your help!
That stuff is used from python.
Progress would be "get your inodes working or ignore them".
Hello Thomas,
I am suprised that you are so sure that it is caused by the inodes. Especially, because the problem appared also on the underlying filesystem (i.e. without mergerfs).
I have now tried ignoring the inodes -on an ext4 drive. Same problem. Thus, I figured it is due to the mtime (size does not change). And in fact: As I was always only backing um this one(!) file, it was always automatically the newest. Backing up one folder, it turned out that everything works as expected:
--files-cache=mtime,size
U /root/testdir/borg-linux64
A /root/testdir/abba
U /root/testdir/borg_mount.sh
U /root/testdir/borgmount.sh
U /root/testdir/certbot.log
d /root/testdir
The newest file is Added, all others are unchanged.
As a conclusion: I had two problems: 1) use_inode was missing in mergerfs 2) my test after use_inode was not a good test, as it was backing up one file only. And if borg backs up one file only, it will always be "A". But not due to inode but due to mtime.
Greetings, Hendrik
If there is an option use_ino, but you do not use it (initially), it was quite clear that this breaks the unchanged file detection. Later you use it, but it seemed to have issues as you have found.
Your very early said you have read the FAQ about "A" status, so not sure why this took so long.
Because we focused on inode, not on mtime.
I'm not familiar with borg but if its using mtime of directories in order to decide anything you need to be sure to set func.getattr=newest
when using mergerfs else the data returned for a stat
will be whatever the first drive listed (by default).
Thanks for that advice. I was not aware!
I am not sure I understand the feature fully, though. Could you please elaborate on this?
Remember though that this is just stat. If the file is later open'ed or unlink'ed and the policy is different for those then a completely different file or directory could be acted on.
Greetings, Hendrik
You have to read the section on policies. Each filesystem function can have independent policies. Generally you want them all the same but some cases its useful. All that line is saying is that... remember you can have different policies for different functions. Just because you set getattr = newest doesn't mean that open or unlink will behave the same unless you use the same policy.
mergerfs is providing a union. You can have multiple files or directories at the same relative path. If getattr = newest then it means when you stat the data for the newest file/dir is returned. But if you have open = ff then you'd open the first file found based on the list of drives.
Hello,
I have added func.getattr=newest. All works like expected now. The backup of ~10Gb now takes 5 minutes over WAN. Brilliant!
Greetings, Hendrik
I was wondering, why my backup took so long -although I ran a complete backup yesterday. Thus, I have increased the verbosity, and I see many (all?) files with the "A" flag. I did some reading and quickly found this:
So, I checked one of the files mtime with this little script:
So, this file is quite old. Nevertheless, I created a new file
touch /srv/DataPool/Dokumente/test.txt
TL;DR Skip to the bottom (marked again by TL;DR)
Still: Many many old files are getting the "A" flag.
The command I run is this:
(line-breaks added for better readability)
borg lists gives me:
Next, I do:
and a ls -l of both files:
and the mtime of both files:
Also the md5sum of both files is identical:
What am I missing?
I did now check another folder:
borg create borg@openmediavault:/srv/_dev_disk_by-id_ata-ST5000DM000-1FK178_W4J1K7R1/borgbackup1/borgbackup::{hostname}-etc /etc --exclude-from /etc/borgmatic/excludes --exclude-from /tmp/tmpgiptutgu --exclude-caches --exclude-if-present .nobackup --compression lz4 --one-file-system --remote-path borg --umask 77 --debug --list --stats
This backs up only /etc (which is also part of a previous backup). Here, many files are correctly identified as 'U'.Why does this work properly on /etc/ but not on /srv/DataPool/Dokumente/? Well, there is one speciality... DataPool is a mergerfs Pool.
So, I tested the example file in isolation:
borg create borg@openmediavault:/srv/_dev_disk_by-id_ata-ST5000DM000-1FK178_W4J1K7R1/borgbackup1/borgbackup::{hostname}-test1 /srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf --exclude-from /etc/borgmatic/excludes --exclude-caches --exclude-if-present .nobackup --compression lz4 --one-file-system --remote-path borg --umask 77 --debug --list --stats
This time, I got a lot of these message -not sure why:
But nevertheless, again I got:
A /srv/DataPool/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
So next I try with the underlying filesystem (looking 'below' the mergerfs): (only the path to the file is changed: /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf)TL;DR just read from here: Summary: when backing up /etc/ files are correctly identified as unchanged. When backing up /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf (or other files on this file system) the files are recognized as 'A')
So now I try:
borg create borg@openmediavault:/srv/_dev_disk_by-id_ata-ST5000DM000-1FK178_W4J1K7R1/borgbackup1/borgbackup::{hostname}-test2 /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf --exclude-from /etc/borgmatic/excludes --exclude-caches --exclude-if-present .nobackup --compression lz4 --one-file-system --remote-path borg --umask 77 --debug --list --stats
No change: A /media/678c601d-8d2e-4be0-9226-d20a6f772f8b/Dokumente/Hendrik/Dokumente/sioux_city1.pdf
If I repeat the last command (just incrementing the archive name to test3) I still get "A".
So, I suspect it has something to do with the file system: Here the output of the mount command:
/dev/sdd on /media/678c601d-8d2e-4be0-9226-d20a6f772f8b type ext4 (rw,noexec,relatime,data=ordered,jqfmt=vfsv0,usrjquota=aquota.user,grpjquota=aquota.group)
What can be the reason?
Greetings, Hendrik