haf / DotNetZip.Semverd

Please use System.IO.Compression! A fork of the DotNetZip project without signing with a solution that compiles cleanly. This project aims to follow semver to avoid versioning conflicts. DotNetZip is a FAST, FREE class library and toolset for manipulating zip files. Use VB, C# or any .NET language to easily create, extract, or update zip files.
Other
545 stars 218 forks source link

Zip: omit Disk Start Number from the Zip64 Central Directory Entry (when possible) #262

Open BhaaLseN opened 2 years ago

BhaaLseN commented 2 years ago

Clause 4.5.3 of the PKWare spec clearly states that it MUST only appear in the Extra block when the corresponding field in the CDE (or LDE) was set to -1 (0xFFFF in the case of the Disk Start Number.)

Previously, this would only be the case when Zip64 was presumed or the disk number was actually outside the range for the CDE. In every other case, the Disk Start Number would go into both locations, violating the spec.

Some tools (such as 7-Zip) show warnings in this case, but often work as expected. Others (such as older versions of System.IO.Compression) follow the spec more stringently and will refuse to use the values from the Zip64 Extra block when the related values do not match their expectation.

However, fixing this to always write a conformant CDE that has its Disk Start Number as 0xFFFF with the actual value in the Extra block breaks other tools (such as marmelroy/Zip on iOS) that do not fully support Zip64.

The spec does allow the Extra block to omit the Disk Start Number (and the Relativ Header Offset) when they fit in the CDE, so we do just that for general compatibility.

Tested with both 7-Zip and System.IO.Compression (which would otherwise bail out and return 0x0000_0000_ffff_ffff as sizes,) as well as marmelroy/Zip (which never looks at the extra field for the disk start,) but I don't really see where (or rather: how) to add a sensible test for this. Mainly because I'd rather not reimplement Zip reading in the test, and I don't think pulling in 7-Zip for a test is the way to go. Nor is attempting to use System.IO.Compression, because they since made their code more accepting of non-conforming Zip files that are otherwise fine.

Fixes #260

vijfhoek commented 1 year ago

I'm currently running into this issue as well, as WinRAR fails to extract zip64 archives presumably because of this issue. Any chance to get this pull request merged?

BhaaLseN commented 1 year ago

Do note that we finally got around to trying this out with iOS on the other side (I don't remember the exact library at the moment, maybe marmelroy/Zip?) which initially got me to look at this...and it still fails. Presumably because the underlying minizip does not correctly look at the Zip64 Extra block when it contains the Disk Start Number but instead uses the -1; breaking more than before.

I have an alternate fix that simply writes a smaller Zip64 Extra block that omits the Disk Start Number completely; and that one works in all cases...but I don't have time at the moment to update my branch. So I'm setting this to Draft until I get around to do this.

BhaaLseN commented 1 year ago

Leaving this as Draft for now, since I didn't have the time (yet) to fully test this with more than 2 small zip files.

Rather than fixing the CDE to use -1 for the disk start number, this just omits it (because some tools still read and use it, despite it being in the extra field as per spec.)

Note that the build fail is from ruby and unrelated to my code changes.

vijfhoek commented 1 year ago

I've just tested your code with

And it worked like a charm in 7zip, WinRAR and Windows Explorer.

EDIT I just tried it in our code base, but sadly it still seems to be broken when encryption is enabled. I am not familiar enough with ZIP to say if this is actually related to this issue though

EDIT 2: I am failing to reproduce my issue using the ZipIt example, I may be calling it wrong in our code base - hold on

BhaaLseN commented 1 year ago

Thanks for the testing, but unfortunately I have no idea how encryption works in there. It is a flag somewhere, but this feels unrelated to Zip64 🤔

vijfhoek commented 1 year ago

I have narrowed my exact problem down to our repeated calling of the ZipFile.Save function, which seems to function incorrectly when Zip64 is enabled. This is unrelated to this PR, but still related to Zip64. I worked around it by.. just not calling Save repeatedly 🙂

MicrosoftTeams-image (4)

Note that this screenshot shows encryption being enabled, but the same happens when it is not.

I have taken a glance at the DotNetZip source, and it seems like the problem lies in ZipEntry.CopyThroughOneEntry. By commenting out that function, surrounding if statement and early return in ZipEntry.Write, the problem no longer occurs.

Tl;dr your PR seems to work exactly as it's supposed to, I'm having issues do to a mostly unrelated problem