Open liuliu opened 7 years ago
Excellent idea, proposal and implementation!
This change is actually generally useful as a slight optimization for smaller zip files and follows neatly on from @ed84124.
Only a couple of issues to fix. Please rebase all your changes to a single commit:
Minor nit: line 108 comment should mention data and no compression.
Issue: lines 278 - 304: logic here is sound but embeds another dependency on dataBlock
and compressionLevel == 0
in the data writing code, which is hard to figure out when we next update this to support e.g. always outputting size + crc in local header. Prefer to separate out the header update logic in a separate phase before the data writing i.e. a separate if (!_compressionLevel && _dataBlock) {...}
code. Once this code updates the header with the size + crc, then it should set a flag to inhibit the data descriptor write at line 337, instead of having to recheck _dataBlock
, generalPurposeBitFlag
etc.
This PR tried to fix the problem with ZipZap generated STORED format with ZipInputStream.
ZipInputStream is a widely used mechanism from JDK that supports reading Zip files. Since it deals with a stream of data, there are limitations in its implementation. For example, ZipInputStream cannot leverage the data from central file header to offset its payload. This is OK for deflated stream because deflated stream is self-descriptive, thus, by inflating the stream, it will naturally reach the end of the stream, and then pick up the data descriptor from the end of the payload for crc32 etc. However, if the payload is in STORED format, ZipInputStream relies on the local file header to provide information such as the uncompressed size, which in ZipZap case, skipped.
This works fine for other Zip file reader implementation or Java's own ZipFile since most of them are using central file header as the source of truth.
This PR fixed the problem by having the correct uncompressed / compressed size / crc32 code at the local file header if STORED format is used, and in that case, don't use data descriptor.
However, since we can only get the payload length at zero cause for data block payload, it is left to be desired about how to implement this for NSOutputStream / CGDataConsumerRef.
Please advice. Also, cc @rsanchezsaez