Closed Eswcvlad closed 3 years ago
When adding a file entry, set the uncompressed_size
field of file_info
and it will pre-determine if it needs zip64 or not.
In my case I need to do without streaming/data descriptor and just have the full sizes in the local header.
I am passing the uncompressed size and it detects it properly. The issue is that if the compressed size will also not fit in 32-bit in the end this won't be visible in the local header: it will just have 0xFFFFFFFF in the base field and no corresponding entry in the Zip64 extra field.
For the spec there is later "This entry in the Local header MUST include BOTH original and compressed file size fields.", so I would assume it means having both 0xFFFFFFFF and put both in the extra field, even if they fit. Though the spec somewhat contradicts itself on Zip64, so I am not sure whether my assumption is correct or not...
@Eswcvlad please review the change I have made.
@nmoinvaz I am guessing the stream_buf changes is the reason I wasn't able to read from that stream before?
Tested locally on my test suite and I've done some bugfixes here: https://github.com/Eswcvlad/minizip/commit/b322d3c904ae5a10d2c5253643a73f2d196855f3 . With this it seems to work properly on win/mac/linux. It also seems to be fine with data descriptor enabled too. The disk_offset change most likely is not required, but I've made it anyway as the entry in the local header confused zipdetails.
Additionally two other related tweaks, that might be useful: https://github.com/Eswcvlad/minizip/commit/e2e375dd1f2b56cc7529e0f3044226180f1e2a7b https://github.com/Eswcvlad/minizip/commit/5093eb6f16a5c48dfce94288403d488b1f98e8b5
I think with regard to _diskoffset the part I was previously missing was the !local
check here:
https://github.com/nmoinvaz/minizip/blob/123c49e94edd144d6ec2d1fc29fc2f4ee8de6a07/mz_zip.c#L794-L795
@Eswcvlad Yes, it is the reason. I'm hoping the change I made works. Originally the buffered stream was only meant for either read or write (not both). Time will tell.
All the changes should be made now.
@nmoinvaz the order of fields is still incorrect here: https://github.com/nmoinvaz/minizip/blob/3afaad4d9ddd8bcc6af8c6f4c7e25478de703bc2/mz_zip.c#L2181-L2186
I have made a change to fix the ordering. However, I am not sure how I feel about the change to the posix stream to w+b
. And perhaps wondering if there is a better way to do the zip64 offset calculation that doesn't require reading.
If we always write ZIP64 extension first in extrafield, then it will be possible to calculate fixed offset much easier. What do you think @Eswcvlad?
Seems good. I'll run my tests later today to recheck just in case.
https://github.com/nmoinvaz/minizip/commit/d183a4061f1f9302cc656373db8fe9d3e5322179 broke dir handling, as directories may appear after the 4 GB mark and would require Zip64 for the disk offset.
Otherwise seems to work fine.
Thanks. I have made a fix to that which will only turn it off zip64 for directory entries in local headers. Central directory headers should still have zip64 for directories due to the issue you mentioned.
I've tested the changes and it still has issues. Since in the end when we update the Zip64 data in the local header, we base it on the needs_zip64 at the central directory on entry close.
So, currently, for directories outside the 4 GB offset there won't be an entry in the local header, but there will be in the central header and the code for updating the local header will overwrite some other extra field or data, as there is no Zip64 extra field there. Somewhat opposite issue with the cushion: in that case it will instead just not update the entry in the local header in case cushion wasn't necessary.
Ideally it would be better to cache whether Zip64 was used in the local header or not (or just the offset to the Zip64 extra field at that point) and use that during the local header update on entry close. Or try to keep it so it is never the case, that there is no Zip64 in local header (i.e. entry open), but it is present in the central header (i.e. on entry close).
I've been testing zipping of big files (basically just full on random content files) and noticed an issue with compressed size values in the Zip64 extra field in the local header when streaming mode is disabled.
The issue is this:
Also there is a similar twin issue, when uncompressed size is less than UINT32_MAX, but compressed size is bigger. The result is similar, but now without the Zip64 extra entry in the local header at all.
So with that we have a proper central directory header, as we have all of the info at that point, but it is impossible to get the compressed size from the local header. This breaks some of the unzipping tools.
I've made a sloppy attempt to fix it here: https://github.com/Eswcvlad/minizip/commit/3b01a375ea0b9bb48a5ffdb9d24eee119be69893 .
The fix boils down to the following:
It seems to be working, but it is sloppy, affects streaming mode and I am not sure if it is save for other cases (like archive splitting).
Could you have a look at this?