google / ripunzip

Spurious failures when unzipping #74

Open criemen opened 2 months ago

criemen commented 2 months ago

Hi,

We're seeing spurious failures during unzip-file on CI, which IMO hint at a concurrency problem. Unfortunately, I don't have a good reproducer, but we do see this on CI every once in a while.

Errors we're seeing:

Error: invalid Zip archive: Invalid local file header
Error: Failed to extract legal/java.desktop/mesa3d.md

Caused by:
    0: Failed to write directory
    1: corrupt deflate stream
Error: Failed to extract bin/java

Caused by:
    0: Failed to write directory
    1: corrupt deflate stream

We're seeing this across multiple platforms (at least macOS and Windows, maybe also Linux) and with multiple zip files. I'd guess the error rate is at or below 1%.

adetaylor commented 4 weeks ago

Thanks for the report - I agree that sounds bad. It's quite surprising though: there's a lot of concurrency complexity in the HTTP reads, but hardly any within the file reads.

Is it remotely possible that the length of the file is changing underneath our feet? This can confuse ripunzip. Perhaps I'll put in place checks for this (as best I can).
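
For illustration, here is a minimal sketch of the kind of check described above, using only the Rust standard library. It is not ripunzip's actual code: `FileSnapshot` and `read_if_stable` are hypothetical names, and the sketch assumes that comparing the file's size and modification time before and after a read is enough to notice a concurrent writer. That is best-effort only, since a writer could still change the file without either value differing.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;
use std::time::SystemTime;

/// Size and modification time of a file at one point in time.
/// (Hypothetical helper, not part of ripunzip.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileSnapshot {
    len: u64,
    // Not every platform or filesystem reports mtime, hence the Option.
    modified: Option<SystemTime>,
}

impl FileSnapshot {
    pub fn capture(path: &Path) -> io::Result<Self> {
        let meta = fs::metadata(path)?;
        Ok(Self {
            len: meta.len(),
            modified: meta.modified().ok(),
        })
    }
}

/// Reads the whole file and fails if its size or mtime changed while
/// reading, or if the number of bytes read disagrees with the size
/// recorded up front.
pub fn read_if_stable(path: &Path) -> io::Result<Vec<u8>> {
    let before = FileSnapshot::capture(path)?;

    let mut buf = Vec::new();
    File::open(path)?.read_to_end(&mut buf)?;

    let after = FileSnapshot::capture(path)?;
    if before != after || buf.len() as u64 != before.len {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!(
                "file changed during read: before={:?}, after={:?}, bytes read={}",
                before,
                after,
                buf.len()
            ),
        ));
    }
    Ok(buf)
}
```

On CI, a check like this run just before extraction would turn a silently truncated or still-growing archive into an explicit error, which could help separate "the input changed" from a genuine bug in the unzip logic.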

criemen commented 4 weeks ago

> Is it remotely possible that the length of the file is changing underneath our feet?

That would be very surprising. It would have to be a pretty bad OS or Bazel bug, and since we've seen this cross-platform, it can hardly be an OS bug. We're unzipping zip files provided by our build system: once that declares that a file has been written, it had better be fully written to disk. Thanks for taking a look at this!