dotnet / android-libzipsharp

A managed wrapper (and then some) around libzip (https://libzip.org/)
MIT License

Fix the elusive invalid zip archive issue that has been a problem for ages! #142

Closed dellis1972 closed 3 months ago

dellis1972 commented 3 months ago

Fixes https://github.com/xamarin/xamarin-android/issues/8988

We had an odd corrupt zip file issue that kept cropping up on our Azure Pipelines builds, and until now we had no idea what caused it. Some of the data in an item's local header (not the central directory) would be written incorrectly. The resulting zip might or might not be extractable, depending on how resilient the extracting software was.

So, what was happening here was that (sometimes) libzip would start writing some data (most likely the local file header) using our stream source callback, seek a few bytes into the data, and then try to seek back to the beginning. The latter seek was done by passing the callback's seek operation an offset of 0 which, unfortunately, the code also used as a guard for whether to perform the seek at all. The effect was that we ignored the seek to 0 and the stream remained at whatever location the previous seek had requested, thus corrupting data. It happened only on the very first entry, since that was the only one which would have position 0 within its range.
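To make the failure mode concrete, here is a minimal Python sketch of the guard bug. It is not the LibZipSharp code: the class names, the `write_header` helper, and the exact write/seek sequence are hypothetical and only mimic the pattern described above (write, seek a few bytes in, seek back to 0, write again).

```python
import io


class BuggySeekSource:
    """Hypothetical stand-in for a stream source callback with the faulty guard."""

    def __init__(self, stream):
        self.stream = stream

    def seek(self, offset):
        # Bug: 0 doubles as "no seek requested", so a genuine seek to 0 is skipped.
        if offset != 0:
            self.stream.seek(offset)
        return self.stream.tell()

    def write(self, data):
        return self.stream.write(data)


class FixedSeekSource(BuggySeekSource):
    def seek(self, offset):
        # Fix: always honour the requested offset, including 0.
        self.stream.seek(offset)
        return self.stream.tell()


def write_header(source):
    # Assumed sequence: write placeholder bytes, seek a few bytes in,
    # seek back to the beginning, then write the real header signature.
    source.write(b"HEADER--")
    source.seek(4)
    source.seek(0)
    source.write(b"PK\x03\x04")
    return source.stream.getvalue()


buggy = write_header(BuggySeekSource(io.BytesIO()))
fixed = write_header(FixedSeekSource(io.BytesIO()))

print(buggy)  # signature lands at offset 4 instead of offset 0
print(fixed)  # signature lands at offset 0 as intended
```

With the buggy guard the second write lands wherever the previous seek left the stream, which matches why only the entry whose range includes position 0 was affected.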

We discovered that just enabling the strict consistency checks would uncover the issue, so they are now enabled in a number of unit tests. Once we did that, it turned out we were writing the corrupt data ALL the TIME! Fixing up the seeking code to take into account that we might want to seek to 0 fixed the issue.
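As a rough illustration of why a stricter check catches this while lenient readers do not, the Python sketch below compares each entry's local file header with its central directory record. This is an assumption about the kind of mismatch such checks look for (libzip offers stricter checking through its `ZIP_CHECKCONS` open flag); it is not libzip's implementation, and `inconsistent_entries` is a hypothetical helper.

```python
import io
import struct
import zipfile

# Fixed 30-byte part of a zip local file header (little-endian).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def inconsistent_entries(data):
    """Return names of entries whose local header disagrees with the central directory."""
    data = bytes(data)
    bad = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            start = info.header_offset
            raw = data[start:start + LOCAL_HEADER.size]
            if len(raw) < LOCAL_HEADER.size:
                bad.append(info.filename)
                continue
            (sig, _version, flags, method, _time, _date,
             crc, csize, usize, _name_len, _extra_len) = LOCAL_HEADER.unpack(raw)
            ok = sig == b"PK\x03\x04" and method == info.compress_type
            # With a data descriptor (flag bit 3) the local CRC/sizes may be zero.
            if ok and not flags & 0x08:
                ok = (crc, csize, usize) == (info.CRC, info.compress_size, info.file_size)
            if not ok:
                bad.append(info.filename)
    return bad


# Build a small valid archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a.txt", b"hello")
good = buffer.getvalue()

# Damage one byte of the CRC field in the first local header (offset 14).
corrupt = bytearray(good)
corrupt[14] ^= 0xFF

print(inconsistent_entries(good))
print(inconsistent_entries(corrupt))
```

The damaged archive still opens, because opening only reads the central directory; the mismatch shows up only when the local header is cross-checked.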

dellis1972 commented 3 months ago

@grendello when you get a mo, can you check this over and check the commit message to see if it's accurate, please?

grendello commented 3 months ago

So, what was happening here was that (sometimes) libzip would start writing some data (most likely the local file header) using our stream source callback, seek a few bytes into the data, and then try to seek back to the beginning. The latter seek was done by passing the callback's seek operation an offset of 0 which, unfortunately, the code also used as a guard for whether to perform the seek at all. The effect was that we ignored the seek to 0 and the stream remained at whatever location the previous seek had requested, thus corrupting data. It happened only on the very first entry, since that was the only one which would have position 0 within its range.