python / cpython

The Python programming language
https://www.python.org

Cannot open zip64 file if Zip64EOCD record has additional data #126834

Open VladRassokhin opened 1 week ago

VladRassokhin commented 1 week ago

Bug report

Bug description:

The "Zip64 end of central directory record" may have additional data in the "zip64 extensible data sector" field. In that case, zipfile should use the offset from the "Zip64 end of central directory locator". Currently that offset is read but ignored.

import zipfile

def run_test():
    with zipfile.ZipFile("file-with-extended-zip64eocd-record.zip", allowZip64=True) as special_file:  # This fails
        print("success")

The problem lies in the following code:

    sig, diskno, reloff, disks = struct.unpack(structEndArchive64Locator, data)
    if sig != stringEndArchive64Locator:
        return endrec

    if diskno != 0 or disks > 1:
        raise BadZipFile("zipfiles that span multiple disks are not supported")

    # Assume no 'zip64 extensible data'
    fpin.seek(offset - sizeEndCentDir64Locator - sizeEndCentDir64, 2)
    data = fpin.read(sizeEndCentDir64)

It should instead be:

    sig, diskno, reloff, disks = struct.unpack(structEndArchive64Locator, data)
    if sig != stringEndArchive64Locator:
        return endrec

    if diskno != 0 or disks > 1:
        raise BadZipFile("zipfiles that span multiple disks are not supported")

    fpin.seek(reloff, 0)
    data = fpin.read(sizeEndCentDir64)
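To illustrate the idea outside of zipfile, here is a minimal, self-contained sketch that finds the Zip64 EOCD record through the locator's offset. The helper `read_eocd64` and the constant names are hypothetical and exist only in this example; the struct layouts follow the ZIP APPNOTE. The sketch also assumes the archive starts at file offset 0, so archives with prepended data would need an extra adjustment.

```python
import io
import struct

# Layouts per the ZIP APPNOTE; these names are local to this sketch.
EOCD64_FMT = "<4sQ2H2L4Q"   # fixed part of the Zip64 EOCD record (56 bytes)
LOCATOR_FMT = "<4sLQL"      # Zip64 EOCD locator (20 bytes)
EOCD64_SIG = b"PK\x06\x06"
LOCATOR_SIG = b"PK\x06\x07"


def read_eocd64(fp, locator_pos):
    """Read the fixed Zip64 EOCD fields via the offset stored in the locator."""
    fp.seek(locator_pos)
    sig, diskno, reloff, disks = struct.unpack(
        LOCATOR_FMT, fp.read(struct.calcsize(LOCATOR_FMT)))
    if sig != LOCATOR_SIG:
        raise ValueError("not a Zip64 EOCD locator")
    # Absolute seek: any extensible data after the fixed fields is irrelevant.
    fp.seek(reloff)
    fields = struct.unpack(EOCD64_FMT, fp.read(struct.calcsize(EOCD64_FMT)))
    if fields[0] != EOCD64_SIG:
        raise ValueError("locator offset does not point at a Zip64 EOCD record")
    return fields


# Fake archive tail: 7 bytes of padding, a record with 5 extra bytes, a locator.
extra = b"\x01\x02\x03\x04\x05"
prefix = b"\x00" * 7
record = struct.pack(
    EOCD64_FMT, EOCD64_SIG, 44 + len(extra), 45, 45, 0, 0, 3, 3, 100, 0) + extra
locator = struct.pack(LOCATOR_FMT, LOCATOR_SIG, 0, len(prefix), 1)
fp = io.BytesIO(prefix + record + locator)

fields = read_eocd64(fp, len(prefix) + len(record))
print(fields[1])  # size field: 44 fixed bytes + 5 extra bytes
```

Checking the signature after the seek is a cheap way to reject a locator whose offset is stale or corrupted.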

CPython versions tested on:

CPython main branch

Operating systems tested on:

macOS

VladRassokhin commented 1 week ago

Hex dump of the end of such a file:

1faafa80: 504b 0606 3100 0000 0000 0000 0000 0000
1faafa90: 0000 0000 0000 0000 767e 0100 0000 0000
1faafaa0: 767e 0100 0000 0000 3063 a700 0000 0000
1faafab0: 5097 031f 0000 0000 04b7 5893 1e50 4b06
1faafac0: 0700 0000 0080 faaa 1f00 0000 0001 0000
1faafad0: 0050 4b05 06ff ffff ffff ffff ffff ffff
1faafae0: ffff ffff ff00 00

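The dump can be decoded with `struct` to see where the fixed-size assumption goes wrong. The following is only a sketch: the variable names are made up for this example, and the layouts are taken from the ZIP APPNOTE. The size field of the Zip64 EOCD record excludes its own 12 leading bytes, so a record without extensible data would report 44, while this one reports 49.

```python
import struct

# Bytes copied from the dump above; the first byte sits at file offset 0x1faafa80.
tail = bytes.fromhex(
    "504b 0606 3100 0000 0000 0000 0000 0000"
    "0000 0000 0000 0000 767e 0100 0000 0000"
    "767e 0100 0000 0000 3063 a700 0000 0000"
    "5097 031f 0000 0000 04b7 5893 1e50 4b06"
    "0700 0000 0080 faaa 1f00 0000 0001 0000"
    "0050 4b05 06ff ffff ffff ffff ffff ffff"
    "ffff ffff ff00 00"
)

FIXED = 56  # fixed part of the Zip64 EOCD record
(sig, size, _made_by, _needed, _disk, _cd_disk,
 _entries_here, total_entries, cd_size, cd_offset) = struct.unpack_from(
    "<4sQ2H2L4Q", tail, 0)

# The size field does not count the signature (4) or itself (8).
extra_len = size + 12 - FIXED
loc_sig, _loc_disk, reloff, _disks = struct.unpack_from(
    "<4sLQL", tail, FIXED + extra_len)

print(size, extra_len)  # 49 5
print(hex(reloff))      # 0x1faafa80, the start of the record
# Seeking back a fixed 56 bytes from the locator lands 5 bytes too late:
print(tail[extra_len:extra_len + 4] == sig)  # False
```

In this file the central directory ends exactly where the Zip64 EOCD record begins (`cd_offset + cd_size == reloff`), which is consistent with the locator offset being the right place to seek to.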
VladRassokhin commented 1 week ago

I can submit a PR in a couple of days. It seems I've found a solution.

VladRassokhin commented 1 week ago

Here's a Python script that generates a zip64 archive with some data in the zip64 end-of-central-directory record: https://gist.github.com/VladRassokhin/9299bb8fbe3169b96e7bc31f91553815 I copied and adjusted zipfile.ZipFile._write_end_record.