pombreda / libarchive

Automatically exported from code.google.com/p/libarchive

mtree file generation issue #264

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. mkdir tmp && cd tmp
2. mkdir foo bar
3. bsdtar -cf foo.mtree --format=mtree --options='!all,type' *

What is the expected output? What do you see instead?
The current output is:
#mtree
bar type=dir
foo type=dir

I would expect:
#mtree
bar type=dir
..
foo type=dir
..

What version are you using?
3.0.4

On what operating system?
Arch Linux

How did you build?  (cmake, configure, or pre-packaged binary)
./configure --prefix=/usr --without-xml2

What compiler or development environment (please include version)?
gcc version 4.7.0 20120505 (prerelease) (GCC) 

Please provide any additional information below.
When reading the currently generated mtree file, libarchive thinks the entries 
are for "foo" and "foo/bar". This is correct for the mtree file that is 
generated.

The first entry "bar type=dir" has no "/" in it, so it is treated as a relative path.  
As it is a directory, all subsequent entries are considered to be within that 
directory.

Manually adding the ".." moves the current directory back up to the base 
level, and then libarchive reads the entries correctly as "foo" and "bar".
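
To illustrate the behaviour described above, here is a minimal sketch of how a reader could resolve slash-free entries against a current-directory stack. It is not libarchive's actual reader code; the function name `resolve_entries` and the simplified parsing (whitespace-separated fields, only `type=` inspected) are assumptions made for illustration.

```python
def resolve_entries(lines):
    """Resolve slash-free mtree entries to paths using a directory stack.

    Hypothetical helper, not part of libarchive. Only handles entries
    whose names contain no '/'.
    """
    stack = []      # directories entered so far, outermost first
    resolved = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments such as "#mtree"
        fields = line.split()
        name = fields[0]
        if name == "..":
            # ".." leaves the current directory
            if stack:
                stack.pop()
            continue
        keywords = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
        resolved.append("/".join(stack + [name]))
        if keywords.get("type") == "dir":
            # a directory entry becomes the new current directory
            stack.append(name)
    return resolved


generated = ["#mtree", "bar type=dir", "foo type=dir"]
expected = ["#mtree", "bar type=dir", "..", "foo type=dir", ".."]
print(resolve_entries(generated))
print(resolve_entries(expected))
```

In this sketch the generated file nests the second directory inside the first, while the file with ".." lines yields two top-level directories.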

Original issue reported on code.google.com by allan.mc...@gmail.com on 21 May 2012 at 7:05

GoogleCodeExporter commented 9 years ago
Attached is a patch that fixes this issue.

Test with:
mkdir test && cd test
mkdir foo bar
touch foo/foo bar/bar baz
bsdtar -cf test.mtree --format=mtree --options='!all,type' *

Which gives the following mtree file:

#mtree
bar type=dir
bar/bar type=file
..
baz type=file
foo type=dir
foo/foo type=file

Technically there probably should be a ".." at the end too (as we are exiting 
the foo directory), but this is cosmetic, since it comes at the end of the file.

Original comment by allan.mc...@gmail.com on 6 Jun 2012 at 1:55


GoogleCodeExporter commented 9 years ago
Attached is an updated patch.  The only change is to print the cosmetic ".." at 
the end of the mtree file if needed.

Original comment by allan.mc...@gmail.com on 18 Jun 2012 at 3:14


GoogleCodeExporter commented 9 years ago
I agree there is a bug here, but I fear your fix is making it worse.  The 
libarchive mtree writer is supposed to support two very different mtree 
variants:
  * The "old" variant (which uses ".." entries to navigate the hierarchy)
  * The "NetBSD" variant (which stores full path names).  This is the format generated by "mtree -C" on NetBSD.

When reading mtree files, the two formats can be distinguished by the presence 
of '/' in the pathname.  The "old" variant NEVER has a '/' in the pathname; the 
"NetBSD" variant ALWAYS has a '/' in the pathname.

For example, for the directory generated by "mkdir bar && touch bar/bar", 
either of the two following files is correct:

  #mtree ("old" variant)
  bar type=dir
    bar type=file
  ..

OR

  #mtree (NetBSD -C variant)
  ./bar type=dir
  bar/bar type=file

The NetBSD variant has the advantage that it can be easily filtered with 
standard tools like 'grep'.  Unfortunately, it's not as widely supported as the 
old variant.

The output you showed in Comment #1 above has some entries with '/' and some 
without, which makes it invalid for either variant.
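
As a rough illustration of the distinction described above, the following sketch classifies an mtree listing by whether its entry names contain '/'. The function name `classify_mtree` and the labels it returns are hypothetical, and the sketch only looks at the first field of each line; it is not how libarchive itself detects the format.

```python
def classify_mtree(lines):
    """Classify a listing as 'old', 'netbsd', or 'mixed' by '/' in names.

    Hypothetical helper for illustration only.
    """
    names = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name = line.split()[0]
        if name in ("/set", "/unset"):
            continue  # special commands are not pathnames
        names.append(name)
    with_slash = sum("/" in n for n in names)
    if with_slash == 0:
        return "old"       # no name contains '/'
    if with_slash == len(names):
        return "netbsd"    # every name contains '/'
    return "mixed"         # some do, some do not


old = ["#mtree", "bar type=dir", "    bar type=file", ".."]
netbsd = ["#mtree", "./bar type=dir", "bar/bar type=file"]
patched = ["#mtree", "bar type=dir", "bar/bar type=file", "..",
           "baz type=file", "foo type=dir", "foo/foo type=file"]
print(classify_mtree(old), classify_mtree(netbsd), classify_mtree(patched))
```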

Original comment by kientzle@gmail.com on 18 Jun 2012 at 5:03

GoogleCodeExporter commented 9 years ago
As far as I can tell, there is nothing disallowing the use of both relative and 
full pathnames in the "mtree v2.0" (NetBSD -C) format.  The 2.0 format simply 
allows full pathnames in addition to the relative path name method of 
specification.  The only thing I see wrong with the output is that it does not 
have the "v2.0" string to confirm that full paths to files are allowed in it.  
Note that the libarchive mtree reader can read the mixed format fine.

I can readily provide a patch to have libarchive generate the v2.0 format by 
just adding a leading ./ in front of directories in the root path.  However, 
unless I am missing something completely obvious, I cannot see a way to 
tell libarchive/bsdtar to generate the "old" variant.  Is that supposed to be 
available now?

Original comment by allan.mc...@gmail.com on 18 Jun 2012 at 5:50

GoogleCodeExporter commented 9 years ago
Following this up... How would you like me to fix this issue?

If writing both versions of mtree file needs to be available, I propose 
providing writing functions for the v1 and v2 versions of the mtree format, and 
picking one for the current function to default to. That would give (e.g.) 
archive_write_set_format_mtree_v1 and archive_write_set_format_mtree_v2.

The read function would not need multiple versions as it already deals with 
both format types.

Original comment by allan.mc...@gmail.com on 31 Jul 2012 at 10:12

GoogleCodeExporter commented 9 years ago
The "v1" and "v2" names are ones that I made up when I created the mtree.5 
manpage.  We should probably abandon those names, as no one else uses them.  I 
also made up the "#mtree" initial line requirement (wishful thinking on my 
part).  Again, that should be abandoned as it doesn't match current practice.  
(I thought some form of signature line would be necessary to do good 
auto-detection of mtree files, but Michihiro did some work that seems to have 
proven you can do reasonable auto-detect without it.)

Mixing the two formats is certainly possible but I doubt that the various tools 
are consistent in how they handle it (there's some ambiguity in the NetBSD 
manpages about how this works).  Best to generate just one or the other.

Personally, I would like to see the NetBSD format be the default, as it's more 
useful in general.  So I would propose that:
   archive_write_set_format_mtree --- generate the NetBSD -C format
   archive_write_set_format_mtree_??? --- generate the older format

I'm not sure what a good name might be.  "legacy" is a little judgmental.  
Perhaps "formatted"?  or "indented"?  Or maybe you have a better idea?

Or perhaps the older format should be the default and the NetBSD -C format 
should be called "full"?

Original comment by kientzle@gmail.com on 1 Aug 2012 at 5:24

GoogleCodeExporter commented 9 years ago
How about "archive_write_set_format_mtree_classic" for the indented version?

Original comment by allan.mc...@gmail.com on 1 Aug 2012 at 5:41

GoogleCodeExporter commented 9 years ago
That sounds great.

Original comment by kientzle@gmail.com on 1 Aug 2012 at 3:42

GoogleCodeExporter commented 9 years ago
Attached is a patch that fixes mtree generation to always prefix files in the 
root directory with "./".   This results in mtree files being generated in the 
"NetBSD -C variant" format.
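
The patch itself is not reproduced here, but the rule it describes can be sketched as follows. This is an assumption about the behaviour, not the patch's code: the helper name `netbsd_style_path` is hypothetical, and it simply prepends "./" to any pathname that contains no "/" so that every entry ends up with a "/" in it.

```python
def netbsd_style_path(path):
    """Prefix root-level names with './' so every pathname contains '/'.

    Hypothetical helper for illustration; not taken from the patch.
    """
    if "/" in path:
        return path        # already a full path, leave unchanged
    return "./" + path     # root-level entry gets a leading "./"


entries = ["bar", "bar/bar", "baz", "foo", "foo/foo"]
print([netbsd_style_path(p) for p in entries])
```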

Adding another function to generate the classic indented format requires much 
larger changes to the mtree writer.  I will not have time to address that in 
the short term...

Original comment by allan.mc...@gmail.com on 8 Sep 2012 at 9:20


GoogleCodeExporter commented 9 years ago
I just committed support for generating an older format.
I also added the ability to read the "NetBSD -D" variant a few days ago.

Original comment by ggcueroad@gmail.com on 26 Sep 2012 at 2:24