borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

extract: original mtime not preserved on files with ResourceFork #7234

Closed rxgh99 closed 1 year ago

rxgh99 commented 1 year ago

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

Bug. Originally logged under Vorta bug #939.

System information. For client/server mode post info for both machines.

MacBook Air, M1.

Your borg version (borg -V).

1.2.3

Operating system (distribution) and version.

macOS 12.6.2

Hardware / network configuration, and filesystems used.

Local APFS (encrypted)

How much data is handled by borg?

n/a

Full borg command line that led to the problem (leave out excludes and passwords)

borg extract

Describe the problem you're observing.

Any file that has macOS xattr com.apple.ResourceFork is restored with a "modified" date equal to the date and time of the restore and not the original modified time of the file. Example below.

$ touch test1.txt  # create test file 1
$ touch test2.txt  # create test file 2
$ xattr -w com.apple.ResourceFork mytestrf test1.txt  # add com.apple.ResourceFork to test1.txt
$ xattr -p com.apple.ResourceFork test1.txt           # check that the xattr was added
mytestrf

$ ls -l test*.txt
-rw-r--r--@ 1 rxgh99  admin  0 Dec 29 00:32 test1.txt  <-- Original modified date/time of test1
-rw-r--r--  1 rxgh99  admin  0 Dec 29 00:31 test2.txt

$ borg create test-repo::xattr2 test*.txt

$ borg list test-repo::xattr2
-rw-r--r-- rxgh99   admin         0 Thu, 2022-12-29 00:32:03 test1.txt  <-- As captured in the repo
-rw-r--r-- rxgh99   admin         0 Thu, 2022-12-29 00:31:56 test2.txt

$ cd restore
$ borg extract ../test-repo::xattr2

$ date
Thu Dec 29 00:41:36 PST 2022

$ ls -l
-rw-r--r--@ 1 rxgh99  admin  0 Dec 29 00:41 test1.txt  <-- Modified date/time at restore
-rw-r--r--  1 rxgh99  admin  0 Dec 29 00:31 test2.txt

$ xattr -p com.apple.ResourceFork test1.txt  # check that the xattr was restored
mytestrf

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

Yes. Any restore of a file that includes com.apple.ResourceFork results in this behavior.

Include any warning/errors/backtraces from the system logs

No warnings or errors.
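
To check a larger restore for the symptom described above, something like the following could be used. It is only a sketch: the function name and the two directory arguments are made up for illustration, and it assumes the original files are still available next to the restored copies. It compares mtime only, at nanosecond resolution.

```python
import os


def find_mtime_mismatches(original_dir, restored_dir):
    """Return names of files whose restored mtime differs from the original.

    Hypothetical helper for illustration; not part of borg.
    """
    mismatches = []
    for name in sorted(os.listdir(original_dir)):
        src = os.path.join(original_dir, name)
        dst = os.path.join(restored_dir, name)
        # Only compare regular files that exist on both sides.
        if not (os.path.isfile(src) and os.path.isfile(dst)):
            continue
        if os.stat(src).st_mtime_ns != os.stat(dst).st_mtime_ns:
            mismatches.append(name)
    return mismatches
```

In the example above, running it on the source directory and `restore` would be expected to list `test1.txt` but not `test2.txt`.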

ThomasWaldmann commented 1 year ago

Thanks for this report. Does this happen also if only any other xattr is set on a file?

rxgh99 commented 1 year ago

I have not tested all xattrs, but based on my (limited) testing it only appears to happen when com.apple.ResourceFork is present as an xattr.

WITH com.apple.ResourceFork
$ borg list test-repo::xattr3
-rw-r----- rxgh99   admin   1522481 Fri, 2012-06-15 18:26:28 VFX_Manual.pdf  //com.apple.ResourceFork present
-rw-r----- rxgh99   admin    277504 Tue, 2011-02-01 20:55:39 WI_setup.doc    //com.apple.ResourceFork present
-rw-r----- rxgh99   admin   1522481 Fri, 2012-06-15 18:26:28 test-norf.pdf   //no xattrs
-rw-r--r-- rxgh99   admin         0 Thu, 2022-12-29 00:32:03 test1.txt       //com.apple.ResourceFork present
-rw-r--r-- rxgh99   admin         0 Thu, 2022-12-29 00:31:56 test2.txt       //no xattrs

$ ls -l restore3
-rw-r-----@ 1 rxgh99  admin  1522481 Dec 29 11:01 VFX_Manual.pdf  <-- date at restore time
-rw-r-----@ 1 rxgh99  admin   277504 Dec 29 11:01 WI_setup.doc    <-- date at restore time
-rw-r-----@ 1 rxgh99  admin  1522481 Jun 15  2012 test-norf.pdf
-rw-r--r--@ 1 rxgh99  admin        0 Dec 29 11:01 test1.txt       <-- date at restore time
-rw-r--r--  1 rxgh99  admin        0 Dec 29 00:31 test2.txt  

restore3 $ xattr WI*
com.apple.ResourceFork
com.apple.metadata:kMDItemWhereFroms

REMOVED com.apple.ResourceFork from WI_setup.doc
$ borg list test-repo::xattr4
-rw-r----- rxgh99   admin   1522481 Fri, 2012-06-15 18:26:28 VFX_Manual.pdf //com.apple.ResourceFork still present
-rw-r----- rxgh99   admin    277504 Tue, 2011-02-01 20:55:39 WI_setup.doc   //Removed com.apple.ResourceFork
-rw-r----- rxgh99   admin   1522481 Fri, 2012-06-15 18:26:28 test-norf.pdf
-rw-r--r-- rxgh99   admin         0 Thu, 2022-12-29 00:32:03 test1.txt
-rw-r--r-- rxgh99   admin         0 Thu, 2022-12-29 00:31:56 test2.txt

$ ls -l restore4
total 6512
-rw-r-----@ 1 rxgh99  admin  1522481 Dec 29 11:03 VFX_Manual.pdf  <-- date at restore time
-rw-r-----@ 1 rxgh99  admin   277504 Feb  1  2011 WI_setup.doc    <-- original date
-rw-r-----@ 1 rxgh99  admin  1522481 Jun 15  2012 test-norf.pdf
-rw-r--r--@ 1 rxgh99  admin        0 Dec 29 11:03 test1.txt       <-- date at restore time
-rw-r--r--  1 rxgh99  admin        0 Dec 29 00:31 test2.txt

restore4 $ xattr VFX*
com.apple.Preview.UIstate.v1
com.apple.ResourceFork
com.apple.metadata:kMDItemWhereFroms

restore4 $ xattr WI*
com.apple.metadata:kMDItemWhereFroms

ThomasWaldmann commented 1 year ago

Hmm, there is no code in borg that special-cases ResourceFork.

So, is it maybe macOS "jumping on" that freshly extracted file and touching it, modifying its timestamp? Maybe some indexing or security stuff?

rxgh99 commented 1 year ago

That I do not know - I was wondering if it was a "timing" issue of some sort, between the extract process and, as you suggest, some OS activities. But then why would the OS only touch files with the com.apple.ResourceFork xattr and not others? It's an old attribute, for sure.

I previously tried a similar process with Duplicacy and Arq. Duplicacy showed the same behavior as here but Arq restored the original modified date.

jdchristensen commented 1 year ago

Is there something like Linux's strace that you could use to see what syscalls borg extract is making, just to confirm whether it is borg or the OS?

jdchristensen commented 1 year ago

I noticed that borg seems to set the mtime before setting the xattrs. Would it help to do it in the other order? Does macOS update the mtime when that particular xattr is set?

ThomasWaldmann commented 1 year ago

https://github.com/borgbackup/borg/blob/1.2.3/src/borg/archive.py#L885

this is how borg 1.2.3 extracts the metadata (after extracting file data, but before closing the file).

ThomasWaldmann commented 1 year ago

@jdchristensen yeah, changing order might help.

but if there is some race with macOS touching fresh files, it would depend on who wins the race.

also it does not explain why it only happens with ResourceFork xattr entry.

ThomasWaldmann commented 1 year ago

From samba git:

On Darwin other than all the normal filesystem operations, 'Finder' (like Explorer in Windows but a little more) keeps its information in two extended attributes named 'com.apple.FinderInfo' and 'com.apple.ResourceFork'.

If these xattrs are not implemented the filesystem won't be shown on Finder, and if they are not implemented properly there may be issues when some of the file operations are done through GUI of Finder. But when a filesystem is used over mountpoint in a terminal, everything is fine and these xattrs are not required.

ThomasWaldmann commented 1 year ago

Note: I am trying to write a test for this.

Update: test reproduces the issue. not only mtime is spoiled, but also atime. birthtime is ok.

Same test succeeds with another xattr name, so it is really specific to com.apple.ResourceFork.
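
A probe along these lines can show whether a given operation changes a file's timestamps. This is a sketch, not the test that was written for borg: `timestamps_changed_by` and `KNOWN_NS` are hypothetical names, and the function resets the file's times first, so it should only be pointed at a scratch file.

```python
import os

KNOWN_NS = 10**18  # arbitrary past timestamp (in ns) used as a baseline


def timestamps_changed_by(path, action):
    """Run action() and report which of atime/mtime changed on path."""
    # Start from a known, old timestamp so that any update is visible.
    os.utime(path, ns=(KNOWN_NS, KNOWN_NS))
    before = os.stat(path)
    action()
    after = os.stat(path)
    return {
        "atime": before.st_atime_ns != after.st_atime_ns,
        "mtime": before.st_mtime_ns != after.st_mtime_ns,
    }


# Possible use on macOS, with the xattr command from the original report:
#   import subprocess
#   timestamps_changed_by("test1.txt", lambda: subprocess.run(
#       ["xattr", "-w", "com.apple.ResourceFork", "mytestrf", "test1.txt"],
#       check=True))
```

Running the same probe with a different xattr name would give the comparison described above.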

rxgh99 commented 1 year ago

"Is there something like Linux's strace that you could use to see what syscalls borg extract is making, just to confirm whether it is borg or the OS?"

There is dtruss, which wraps dtrace, but SIP (System Integrity Protection) has to be disabled to run it.

ThomasWaldmann commented 1 year ago

@jdchristensen no, moving the timestamp setting code close to the end (right before the place where it might set the immutable flag) does not help.

Correction: it fixes mtime, but not atime.
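
The reordering idea can be sketched roughly as below. This is not borg's code (borg applies metadata through the open file descriptor, see the archive.py link above); `restore_metadata` and the caller-supplied `set_xattr` callback are hypothetical, the latter because the xattr API differs between platforms. The sketch also does not explain why atime still ends up wrong on macOS.

```python
import os


def restore_metadata(path, xattrs, atime_ns, mtime_ns, set_xattr):
    """Apply xattrs first, then timestamps.

    set_xattr(path, name, value) is a hypothetical, caller-supplied helper.
    """
    for name, value in xattrs.items():
        set_xattr(path, name, value)
    # Timestamps go last, so whatever the xattr writes did to mtime
    # is overwritten with the archived value.
    os.utime(path, ns=(atime_ns, mtime_ns))
```

With the timestamps set last, an xattr write that bumps mtime no longer matters for the final result, which matches the mtime part of the observation above.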

ThomasWaldmann commented 1 year ago

@rxgh99 btw, congrats on finding this. I guess this is a rather old and "special" bug...

ThomasWaldmann commented 1 year ago

This might even be an issue in macOS itself. I guess setting the ResourceFork xattr should not modify mtime, because setting any other xattr does not do that either. Likely the same for atime.

So, the "bug" in borg is just that we can work around this for mtime and this is what PR #7235 implements. The issue is still there for atime, though.

I reproduced this on macOS Ventura 13.1 (M1).

rxgh99 commented 1 year ago

Old indeed ;) But it (the resource fork) can tend to litter prior installations and data, especially data that may have traversed a few years and OS versions. Would be rather good, though, if a resolution was found, for consistency across all xattrs.

I have not checked whether Arq restores time fields other than modified and created. But it is the only tool I've tested so far that restored those two fields (modified and created times) to their original values.

Which time fields are you testing for - mtime, atime, others? I can test with Arq to see if there are similarities or not.

ThomasWaldmann commented 1 year ago

Tested: birthtime, mtime (and atime, but that is commented out in the final test code).

OK: birthtime

Fixed / worked around after #7235: mtime

Not fixed: atime (but guess we can not fix this in borg)

ctime can not be set by userspace.
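
For anyone who wants to inspect the same fields from Python, a minimal sketch follows. `read_times` is a made-up name; `st_birthtime` is only present on some platforms (macOS among them), so the sketch falls back to `None` elsewhere.

```python
import os


def read_times(path):
    """Return atime/mtime/ctime in ns, plus birthtime in seconds if available."""
    st = os.stat(path)
    return {
        "atime_ns": st.st_atime_ns,
        "mtime_ns": st.st_mtime_ns,
        # ctime is maintained by the kernel; os.utime() cannot set it.
        "ctime_ns": st.st_ctime_ns,
        # Not every platform exposes st_birthtime, hence the getattr().
        "birthtime": getattr(st, "st_birthtime", None),
    }
```

Calling it before `borg create` and again after `borg extract` gives the values to compare for each field.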

rxgh99 commented 1 year ago

A quick and dirty comparison. The color coding is relative to the original: green = exact match; red = mismatch with the original. Generally, Arq matches on mtime and btime and appears to set atime to mtime. This also seems to be what happens with Borg on the file without com.apple.ResourceFork (test-norf.pdf).

[Image: restore-compare]

Borg extracted files:

$ stat -f "%N%n atime:%Sa%n ctime:%Sc%n mtime:%Sm%n btime:%SB%n" *
VFX_Manual.pdf
 atime:Dec 29 17:15:00 2022
 ctime:Dec 29 17:14:59 2022
 mtime:Dec 29 17:14:59 2022
 btime:Jun 15 18:26:28 2012
WI_setup.doc
 atime:Dec 29 17:15:00 2022
 ctime:Dec 29 17:14:59 2022
 mtime:Dec 29 17:14:59 2022
 btime:Feb  1 20:55:39 2011
test-norf.pdf
 atime:Jun 15 18:26:28 2012
 ctime:Dec 29 17:14:59 2022
 mtime:Jun 15 18:26:28 2012
 btime:Jun 15 18:26:28 2012

Arq extracted files:


$ stat -f "%N%n atime:%Sa%n ctime:%Sc%n mtime:%Sm%n btime:%SB%n" *
VFX_Manual.pdf
 atime:Jun 15 18:26:28 2012
 ctime:Dec 29 17:43:35 2022
 mtime:Jun 15 18:26:28 2012
 btime:Jun 15 18:26:28 2012
WI_setup.doc
 atime:Feb  1 20:55:39 2011
 ctime:Dec 29 17:43:38 2022
 mtime:Feb  1 20:55:39 2011
 btime:Feb  1 20:55:39 2011
test-norf.pdf
 atime:Jun 15 18:26:28 2012
 ctime:Dec 29 17:43:31 2022
 mtime:Jun 15 18:26:28 2012
 btime:Jun 15 18:26:28 2012

rxgh99 commented 1 year ago

For completeness: if I create an archive with --atime, files without com.apple.ResourceFork are extracted with their original atime. Files with com.apple.ResourceFork still show the behavior described above. Note the values for test-norf.pdf below.

$ stat -f "%N%n atime:%Sa%n ctime:%Sc%n mtime:%Sm%n btime:%SB%n" *
VFX_Manual.pdf
 atime:Dec 30 00:41:14 2022
 ctime:Dec 30 00:41:13 2022
 mtime:Dec 30 00:41:13 2022
 btime:Jun 15 18:26:28 2012
WI_setup.doc
 atime:Dec 30 00:41:14 2022
 ctime:Dec 30 00:41:13 2022
 mtime:Dec 30 00:41:13 2022
 btime:Feb  1 20:55:39 2011
test-norf.pdf
 atime:Apr  8 12:37:33 2022  <-- original atime, preserved with --atime
 ctime:Dec 30 00:41:13 2022
 mtime:Jun 15 18:26:28 2012
 btime:Jun 15 18:26:28 2012
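
The atime pattern seen in these listings for files without com.apple.ResourceFork (atime equal to mtime by default, original atime when the archive was created with --atime) would be consistent with logic like the following. This is an assumption drawn from the observed output, not a description of borg's code, and `apply_times` is a hypothetical name.

```python
import os


def apply_times(path, mtime_ns, atime_ns=None):
    """Set mtime; use the archived atime if there is one, else reuse mtime."""
    if atime_ns is None:
        # No atime stored for this file: fall back to mtime.
        atime_ns = mtime_ns
    os.utime(path, ns=(atime_ns, mtime_ns))
```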