pombreda / libarchive

Automatically exported from code.google.com/p/libarchive
Other
0 stars 0 forks source link

Problems in test_write_format_iso9660_filename #101

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
It seems to me that the added files and the extracted files from the created 
iso file in test_write_format_iso9660_filename does not match, or atleast the 
output has many more files then was actually added.

I guess there's a problem either in the iso9660 joliet extension WRITE or READ 
support.

What steps will reproduce the problem?
1. Edit libarchive/test/test_write_format_iso9660_filename.c and add debug 
printout of the filenames added to the iso in the "add_entry" helper function 
then add debug printout of extracted/parsed files last in the "verify_file" 
helper function.
2. run test to capture added and extracted/parsed files debug output.
3. sort the output of each into a file and diff these files.

What is the expected output? What do you see instead?
I'd expect the diff to say there are no changes. Instead there where alot of 
extra files extracted/parsed.

Please use labels and text to provide additional information.

The diff is attached for your convenience.

This problem was found by enabling "trailing dot stripping" in the joliet 
extension read support iso9660 code. This made duplicates appear in the test. 
With closer investigation the problem doesn't seem to be actual duplicates but 
what was described above. An exact check of added/found files would probably be 
better in the test then just a check for duplicate files.

Original issue reported on code.google.com by a.henrik...@gmail.com on 4 Jul 2010 at 7:12

Attachments:

GoogleCodeExporter commented 9 years ago
for more info / previous discussion see: r2514, r2515, r2520, r2521

Original comment by a.henrik...@gmail.com on 4 Jul 2010 at 7:17

GoogleCodeExporter commented 9 years ago
I think ISO writer shouldn't allow trailing dot characters if ISO reader trims 
trailing a dot character.

--- libarchive/archive_write_set_format_iso9660.c       (revision 2526)
+++ libarchive/archive_write_set_format_iso9660.c       (working copy)
@@ -5951,6 +5951,14 @@
                if ((int)l > ffmax)
                        l = ffmax;
                np->identifier = (char *)p;
+               /* Do not allow trailing dot(.) characters.
+                * For example, 'foo.' or 'foo...' */
+               for (lt = l - 2; lt >= 0; lt -= 2) {
+                       if (p[lt] == 0 && p[lt+1] == 0x2E)/* '.' */
+                               archive_be16enc(&p[lt], 0x005F); /* '_' */
+                       else
+                               break;
+               }
                lt = l;
                dot = p + l;
                weight = 0;

Original comment by ggcueroad@gmail.com on 4 Jul 2010 at 6:21

GoogleCodeExporter commented 9 years ago
The writer should not change filenames any more than is necessary.

Original comment by kientzle@gmail.com on 4 Jul 2010 at 7:20

GoogleCodeExporter commented 9 years ago
A new patch:

--- libarchive/archive_write_set_format_iso9660.c       (revision 2526)
+++ libarchive/archive_write_set_format_iso9660.c       (working copy)
@@ -5951,6 +5951,11 @@
                if ((int)l > ffmax)
                        l = ffmax;
                np->identifier = (char *)p;
+               /* If the last character of a filename is '.', Windows
+                * standard tools(such as cmd.exe, explorer.exe or notepad.exe)
+                * cannot access the file. */
+               if (p[l-2] == 0 && p[l-1] == 0x2E) /* '.' */
+                       archive_be16enc(&p[l-2], 0x005F); /* '_' */
                lt = l;
                dot = p + l;
                weight = 0;

I made some files through cygwin and checked whether
those can be accessed by Windows tools or not.

filename cmd.exe explorer.exe notepad.exe
 f         o          o            o
 f.        x          x            x
 f..       x          x            x
 f._       o          o            o
 f.._      o          o            o

  o: OK.
  x: NG, access 'f' file.

Original comment by ggcueroad@gmail.com on 5 Jul 2010 at 7:20

GoogleCodeExporter commented 9 years ago
I think the patches posted are quite off-topic. It might be nice to not write 
trailing dots in filenames, but assuming all other iso readers strip trailing 
dots and we can enable the code to do so as well it's not a problem.

The real problem here is why, when adding foo to the iso, we read back foo, 
foo000, foo001, ..., foo00A, ...

Where does all these extra files come from?

Original comment by a.henrik...@gmail.com on 8 Jul 2010 at 9:10

GoogleCodeExporter commented 9 years ago
There are too many filenames in la-added-found.diff and including ISO9660 
8.3/rockridge filenames.
It seems a process of making la-found-sorted.txt was wrong.

I made la-added-found-sorted.diff with a patch below here:
Index: libarchive/test/test_write_format_iso9660_filename.c
===================================================================
--- libarchive/test/test_write_format_iso9660_filename.c    (revision 2527)
+++ libarchive/test/test_write_format_iso9660_filename.c    (working copy)
@@ -24,6 +24,7 @@
  */
 #include "test.h"

+static FILE *fp = NULL;
 /*
  * Check that an ISO 9660 image is correctly created.
  */
@@ -44,6 +45,8 @@
    archive_entry_set_size(ae, 0);
    assertEqualIntA(a, ARCHIVE_OK, archive_write_header(a, ae));
    archive_entry_free(ae);
+   if (fp)
+       fprintf(fp, "%s\n", fname);
 }

 struct fns {
@@ -135,6 +138,8 @@
    }
    /* Save the filename which is appeared to use above next time. */
    fns->names[fns->cnt++] = strdup(archive_entry_pathname(ae));
+   if (fp)
+       fprintf(fp, "%s\n", archive_entry_pathname(ae));
 }

 static void
@@ -315,7 +320,10 @@
    /*
     * Create ISO image with no option.
     */
+   fp = fopen("/tmp/la-added.txt", "w");
    fcnt = create_iso_image(buff, buffsize, &used, NULL);
+   fclose(fp);
+   fp = NULL;

    fns.names = (char **)malloc(sizeof(char *) * fcnt);
    assert(fns.names != NULL);
@@ -331,7 +339,10 @@
    fns.longest_len = 0;
    fns.maxlen = fns.maxflen = fns.maxelen = 64;
    fns.opt = ALLOW_LDOT;
+   fp = fopen("/tmp/la-found.txt", "w");
    verify(buff, used, JOLIET, &fns);
+   fclose(fp);
+   fp = NULL;

    /* Verify ISO9660 filenames. */
    fns.longest_len = 0;

Original comment by ggcueroad@gmail.com on 8 Jul 2010 at 1:13

Attachments:

GoogleCodeExporter commented 9 years ago
Has this issue been resolved in libarchive/trunk?

Is there any reason to hold up the libarchive 3.0 release for it?

Original comment by kientzle@gmail.com on 26 Nov 2011 at 9:59

GoogleCodeExporter commented 9 years ago
I looked at this a little bit and haven't yet found anything alarming, though I 
agree the test could be improved.

The test recreates the ISO image 6 times (with different options) and reads 
those images back a total of 17 times (with different options), so you should 
expect to see names appear different numbers of times in add_entry and 
verify_file.

It would be nice to improve the test to verify that each filename was read back 
exactly as many times as expected.  This would require adding some bookkeeping 
to the existing code, of course.  The tricky part is accounting for how 
different ISO formats change names (plain ISO9660 converts names to uppercase, 
for example).

Original comment by kientzle@gmail.com on 31 Dec 2011 at 5:31