pombreda / libarchive

Automatically exported from code.google.com/p/libarchive
Other
0 stars 0 forks source link

bsdtar in extraction mode truncates the directory timestamps (patch attached) #83

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a directory on a file system whose timestamps have nanosecond 
resolution,
 e.g. xfs or ext4
2. Archive the directory with "bsdtar --create --format=pax" (the older formats 
do not store 
complete timestamps)
3. Extract the archive with "bsdtar --extract --preserve-permissions", then 
check the directory 
timestamp with "ls -d --full-time"

What is the expected output? What do you see instead?
The expected result is to restore the original timestamp. I see instead that 
the timestamp is 
truncated to microsecond resolution.

What version are you using?
2.8.3

On what operating system?
Linux that has utimensat

How did you build?  (cmake, configure, or pre-packaged binary)
configure

What compiler or development environment (please include version)?
Does not matter.

Please provide any additional information below.
This bug is caused by the file "libarchive/archive_write_disk.c", where only 
the function
set_time at line 1841 was updated to use utimensat instead of utimes, but the 
function
_archive_write_close at line 1230 was not updated.

I have attached a patch that ensures that _archive_write_close uses utimensat, 
where available.
The patch also adds a lafe_warnc message in tar/bsdtar.c, which gives a 
temporary solution
for issue 82.

I am now using a patched bsdtar and I am happy with it. Nevertheless, I have 
written the patch 
after just 10 to 20 minutes of browsing the source files so someone familiar 
with the libarchive 
sources should review it. I have derived the HAVE_UTIMENSAT case from the 
HAVE_UTIMES 
case, but I do not know if there exists any system that has both HAVE_UTIMENSAT 
and 
HAVE_STRUCT_STAT_ST_BIRTHTIME and if the code is right for that case.

I want to add a comment. In my opinion, the first requirement for an archiving 
program is to 
be able to make an exact copy of the archived files. If this requirement is not 
fulfilled, then the 
archiving program is useless, regardless how fast or convenient it might be.

Sadly, in the year 2010, most widespread archiving programs that are available 
for UNIX file 
systems, either are not able at all to make exact copies, or they require 
non-default options that 
are not well documented. The two usual problems are the extended attributes 
(including ACLs) 
and the timestamps.

bsdtar is by far the best archiving program that I have found (from this point 
of view), but I still
had to waste a day to experiment with the options in order to discover that 
"--format=pax" is 
mandatory for archive creation and "--preserve-permissions" is mandatory for 
archive 
extraction, then I had to read the sources and patch the bug and now I finally 
have a bsdtar 
that can be used to backup XFS files.

At least the source files were quite clear, thus I could find quickly what to 
change, so their 
authors did a very good job.

Original issue reported on code.google.com by a.bocani...@computer.org on 25 Mar 2010 at 7:21

Attachments:

GoogleCodeExporter commented 9 years ago
Are you sure about the HAVE_STRUCT_STAT_ST_BIRTHTIME fragment? It looks like the
second call to utimensat is redundant?

Please keep patches specific to the issue at hand.

Original comment by joerg.sonnenberger@googlemail.com on 25 Mar 2010 at 7:28

GoogleCodeExporter commented 9 years ago
I have deleted from the patch given above the warning message that did not 
belong to this issue and I have 
attached the new patch here.

Regarding the HAVE_STRUCT_STAT_ST_BIRTHTIME case, I have just copied the code 
from the 
HAVE_UTIMES case, replacing utimes with utimensat.

I have no idea what this birthtime means and why on such systems utimes is 
called twice. Whoever wrote 
that code should know the reason.

I assume that this birthtime applies to some file systems that maintain a file 
creation time, and that there 
exists some operating system that uses this non-standard means to set it. 
Whoever wrote that code should 
have commented it better.

Anyway, the HAVE_STRUCT_STAT_ST_BIRTHTIME case is not used on Linux, so I 
cannot test it. Like I said in 
my first post, I do not know if there exists any OS where both 
HAVE_STRUCT_STAT_ST_BIRTHTIME and
HAVE_UTIMENSAT are true, so that part of the code might be never used.

Original comment by a.bocani...@computer.org on 25 Mar 2010 at 8:16

Attachments:

GoogleCodeExporter commented 9 years ago
Tim, you did the original change in r212. Please double check the intention of 
the
original fragment and apply the patch.

Original comment by joerg.sonnenberger@googlemail.com on 25 Mar 2010 at 8:41

GoogleCodeExporter commented 9 years ago
joerg.sonnenberger wrote:
> Are you sure about the HAVE_STRUCT_STAT_ST_BIRTHTIME fragment?
> It looks like the second call to utimensat is redundant?

According to FreeBSD's utimes(2) manpage:

"For file systems that support file birth (creation) times (such as UFS2), the 
birth time will be set to the value of the second element if the second element 
is older than the currently set birth time.  To set both a birth time and a 
modification time, two calls are required; the first to set the birth time and 
the second to set the (presumably newer) modification time.  ..."

I do not know if this behavior has been reproduced in utimensat, if this system 
call was introduced in newer 
versions of FreeBSD, because I am still a happy user of "FreeBSD 4.11-STABLE" 
(six years of uptime without 
a minute of downtime) :-)

Original comment by a.bocani...@computer.org on 25 Mar 2010 at 9:40

GoogleCodeExporter commented 9 years ago
OK, please add a proper comment about that then.

Original comment by joerg.sonnenberger@googlemail.com on 25 Mar 2010 at 9:56

GoogleCodeExporter commented 9 years ago
I have looked at the man pages on the FreeBSD site and utimensat appears 
neither in FreeBSD 8 nor in 
FreeBSD 9.

Therefore the lines belonging to the HAVE_STRUCT_STAT_ST_BIRTHTIME case will 
never be compiled yet.

When the FreeBSD developers will introduce utimensat, it will be logical to 
make it behave exactly like 
utimes does now. In that case the code from the patch will be correct.

The alternative is that suggested in the FreeBSD utimes(2) man page: "Ideally a 
new system call
will be added that allows the setting of all three times at once."

If this will happen (not very likely), the existing code will have to be 
modified to use the new system call.

I have modified the patch by adding a comment containing these facts and I have 
attached it here.

Original comment by a.bocani...@computer.org on 25 Mar 2010 at 10:29

Attachments:

GoogleCodeExporter commented 9 years ago
Since the version 5.0, NetBSD has adopted the birthtime feature from FreeBSD, 
so this is the second 
operating system for which HAVE_STRUCT_STAT_ST_BIRTHTIME is true. OpenBSD & 
DragonFlyBSD do not 
appear to have adopted the birthday feature.

However, like FreeBSD, NetBSD does not have utimensat and it is unlikely that 
they will introduce this 
feature before FreeBSD or in an incompatible way.

Therefore, I do not think that is necessary to modify the comment posted above 
to also mention the fact that 
what is said about FreeBSD is also true for the recent versions of NetBSD.

Original comment by a.bocani...@computer.org on 25 Mar 2010 at 11:20

GoogleCodeExporter commented 9 years ago
OK, I was lazy.

I have modified the patch by extending the comment.
Now the comment really contains all the known facts.
I have attached the new patch.

Original comment by a.bocani...@computer.org on 26 Mar 2010 at 9:17

Attachments:

GoogleCodeExporter commented 9 years ago
I believe that in your web home page, where you list "The libarchive library 
features:", you should add a 
bullet and mention the fact that, when using the 2001 PAX format, libarchive is 
able to store all the 
information pertaining to a file, including extended attributes, ACLs and full 
timestamps, even when they 
have nanosecond resolution.

In my opinion, the ability of making an exact copy of any archived file is even 
more important then the other 
features that you list there.

GNU tar from the standard distribution lacks this feature. There are patched 
versions of GNU tar, e.g. 
RedHat, that might be better, but I did not test them and I doubt that they 
preserve the full timestamps.

The cpio and pax programs that are available in most distributions are old 
programs that do not preserve 
extended attributes or timestamps.

star, when given a lot of non-default options, preserves extended attributes & 
ACLs, but truncates the 
timestamps.

Therefore, bsdtar is the only program that I have found and that is able to 
make exact copies of files from 
UNIX file systems and you should advertise better this fact.

While the various "dump" programs are able to make exact copies, they are not 
suited for most applications 
because it is much more likely to need to restore the files on a different hard 
disk than the original one.

Original comment by a.bocani...@computer.org on 26 Mar 2010 at 10:22

GoogleCodeExporter commented 9 years ago
Re: timestamp patch

Please try the attached patch.  This takes your idea a step further by 
refactoring
some of the existing time setting functions.  In particular, there is now only 
one
place that calls utimensat, only one place that knows about the BSD birthtime
convention, etc.  It's a considerably more intrusive patch but I think the 
result is
a bit cleaner.  Let me know if this works properly for you.

Re: "when using the 2001 PAX format, libarchive is able to store all the 
information
pertaining to a file"

Unfortunately, this is not yet universally true, though with the ongoing help 
and
support of folks like yourself, we are getting closer.  ;-)  Probably the most
important omission at the moment is support for NFS4/NTFS ACLs.

Original comment by kientzle@gmail.com on 4 Apr 2010 at 5:36

Attachments:

GoogleCodeExporter commented 9 years ago
I have tried the "dir_tstamp3.patch" and it seems to be all right. Actually I 
was also of the same opinion with 
you when I saw the duplicated code for setting times that it should be unified 
in a single set_time function, 
but, because I was not familiar with your code, I preferred in my initial patch 
to make minimal changes to 
your sources.

However, if you have already cleaned up most of the code, you should finish it 
and take the prototype of the 
set_time function, including the opening "{", outside the conditional 
compilation branches. Likewise, the 
closing "}" of the set_time function should be taken outside the conditional 
compilation branches.

As it is now, when the function prototype is repeated a large number of times, 
it is confusing and it also it 
presents an opportunity of errors, if someone would modify by mistake only one 
of the copies of the 
function prototype.

Best regards !

Original comment by a.bocani...@computer.org on 11 Apr 2010 at 6:29

GoogleCodeExporter commented 9 years ago
Committed to libarchive/trunk as r2231.

Thank you very much for your help working through this problem.  I'm quite 
pleased
with the final result.

Original comment by kientzle@gmail.com on 11 Apr 2010 at 7:15