python / cpython

The Python programming language
https://www.python.org
Other
63.79k stars 30.55k forks source link

zipfile: use correct system type on unixy systems #42827

Closed ronaldoussoren closed 18 years ago

ronaldoussoren commented 18 years ago
BPO 1412872
Nosy @loewis, @ronaldoussoren
Files
  • zipfile.patch
  • zipfile-2.patch: updated version
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = created_at = labels = ['library'] title = 'zipfile: use correct system type on unixy systems' updated_at = user = 'https://github.com/ronaldoussoren' ``` bugs.python.org fields: ```python activity = actor = 'loewis' assignee = 'none' closed = True closed_date = None closer = None components = ['Library (Lib)'] creation = creator = 'ronaldoussoren' dependencies = [] files = ['6976', '6977'] hgrepos = [] issue_num = 1412872 keywords = ['patch'] message_count = 8.0 messages = ['49396', '49397', '49398', '49399', '49400', '49401', '49402', '49403'] nosy_count = 3.0 nosy_names = ['loewis', 'ronaldoussoren', 'pete_forman'] pr_nums = [] priority = 'normal' resolution = 'accepted' stage = None status = 'closed' superseder = None type = None url = 'https://bugs.python.org/issue1412872' versions = ['Python 2.5'] ```

    ronaldoussoren commented 18 years ago

    This patch updates the contructor of zipfile.ZipInfo to initialize the create_system attribute to 3 instead of 0 on systems that are not Windows.

    Without this patch the unzip command won't honour the file mode that is stored in the zip file.

    ronaldoussoren commented 18 years ago

    Logged In: YES user_id=580910

    I've updated the patch based on a simular patch by Pete Forman (UID: pete_forman). This version contains a lookup-table of create_systems based on the os.name and also supports some other non-windows systems.

    4a48c1a3-d390-4d6f-967c-42eaeef5e2a4 commented 18 years ago

    Logged In: YES user_id=315964

    The merged patch looks good.

    One extra comment that I'd made was that if os.name is 'java' then a further query of java.lang.System.getProperty("os.name") might be in order.
    The string returned from that is 'Linux', 'Windows XP' and 'SunOS' on the platforms I can test. A quick search turned up this page: http://lopica.sourceforge.net/os.html

    On that basis I'd propose a rule that if the Java os.name starts with 'Windows' or 'OS/2' then use 0; if it starts with 'Mac OS' then 7; else 3. Perhaps someone with 'Mac OS X' could pronounce whether it ought to be 3 (UNIX) or 7 (Macintosh). Comments from Jython experts welcome as well.

    The spec for the ZIP file format is at PKWARE. http://www.pkware.com/business_and_developers/developer/ popups/appnote.txt

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 18 years ago

    Logged In: YES user_id=21627

    I think it is more complicated than that. In version 6.2.2, a value of 19 is assigned to OS/X. There might also be cases when 10 (Windows NTFS) is the right answer; it appears they look at the file system rather than the operating system.

    HOWEVER, it also says that the value should be 0 if the external file attributes are compatible with DOS and pkzip 2.04g.

    Perhaps it is a bug in unzip that it doesn't honor the file mode?

    4a48c1a3-d390-4d6f-967c-42eaeef5e2a4 commented 18 years ago

    Logged In: YES user_id=315964

    I would contend that UNIX file permissions are not compatible with MS-DOS though it is possible that PKZIP 2.04g might read them and convert them to an approximation. I would hope that an unzipping program would ignore any file type code that it did not know how to interpret. The ZIP spec explicitly says the the external file attributes are set to zero in the case where input is from standard input. I infer from that that an unzipper should at least be able to extract the bytes of a file.

    Rather than argue over fine points of interpretation I think that we should take the pragmatic view that a ZIP archive is most likely to be unzipped on the same type of file system that it was created on.

    The file system code is indeed more complicated. Setting one type for the archive is limiting. It ought to be set for each member. And yes, it should be file system rather than OS.

    unzip does honor the file mode. External file attributes must be interpreted according to 'version made by'. It does have options such as '-X restore UID/GID info' but those control how much metadata is unpacked, not override incorrect fields.

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 18 years ago

    Logged In: YES user_id=21627

    I looked at the "compatible with DOS" issue again, and I think the story is this:

    The low-order byte of the external file attributes are the DOS bits (read-only, hidden, system, directory); one can claim compatibility if these bits have the right values. The Unix bits are the higher 16 bits, and can have indepedent values.

    InfoZIP (on Unix) will use the Unix bits as is-if the creator system is UNIX, VMS, ACORN, ATARI, ATHEOS, BEOS, QDOS or TANDEM. For FAT (0), it will use the Unix bits if they are "consistent" with the DOS bits. In all other cases (including NTFS and MAC), it will use the DOS bits.

    "consistent" is defined wrt. the U bits: the DOS directory bit must be the same as the u+x bit, and the u+w bit must be the negated read-only bit.

    If only the DOS bits are used, they are used to synthesize POSIX bits: u,g,o get all the same bits, the w bits are derived from read-only, and the x bit is derived from the subdir bit.

    As for -X (restore UID/GID): They come from the PK Unix extra field (0x000d), the InfoZip Unix extra field (0x5855) or the InfoZIP Unix2 extra field (0x7855). We currently don't create any of these, so this option in unzip won't have any effect.

    ronaldoussoren commented 18 years ago

    Logged In: YES user_id=580910

    Martin,

    Thank you for investigating this. It seems to me that my original patch should be save then, it sets the creator_system to unix on unix-y systems only.

    The current version of zipfile sets the higher 16 bits external_attr to the file mode as returned os.stat and doesn't set the lower bits at all (the comment mentions 'unix attributes').

    As you already noticed zipfile doesn't set any other attributes, such as user and group, and IMHO that wouldn't be very useful for most usecases anyway.

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 18 years ago

    Logged In: YES user_id=21627

    I agree, and have committed the original patch as r42250.