alexcrichton / tar-rs

Tar file reading/writing for Rust
https://docs.rs/tar
Apache License 2.0
616 stars 178 forks

Intermittent/partially unpacked files #320

Closed milesj closed 1 year ago

milesj commented 1 year ago

Hey all,

Been running into a problem that I've been unable to figure out. I'm building moon (https://github.com/moonrepo/moon), a multi-language task runner and monorepo tool. We use tar heavily to cache build outputs, and unpack them on cache hits. We've seen many reports that unpacking would intermittently stop halfway through a file, without any error. This results in broken builds.

Here's an example issue: https://github.com/moonrepo/moon/issues/735 And a Discord thread: https://discord.com/channels/974160221452763146/1101545513439997962

Our untar implementation is almost the same as the example from this repo, here's the code: https://github.com/moonrepo/moon/blob/master/crates/core/archive/src/tar.rs#L212

The only difference is that we added a difference-checking layer, to avoid deleting/unpacking files when not necessary. This helped speed up our unpacking flow by almost 10x. This is the diff function, it's pretty simple: https://github.com/moonrepo/moon/blob/master/crates/core/archive/src/tree_differ.rs#L90
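For readers who don't want to follow the link, here is a rough sketch of what a difference-checking layer of this kind can look like. It is not the code from moon's tree_differ.rs; the names `needs_write` and `write_if_changed` are hypothetical, and it uses only the standard library rather than the tar crate. The idea is to compare sizes first (cheap) and only fall back to a full byte comparison when the sizes match.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: returns true when `dest` has to be (re)written so
/// that it ends up holding exactly `new_contents`.
fn needs_write(dest: &Path, new_contents: &[u8]) -> io::Result<bool> {
    let meta = match fs::metadata(dest) {
        Ok(meta) => meta,
        // A missing file always has to be written.
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(true),
        Err(err) => return Err(err),
    };
    // Not a regular file, or a different size: treat it as stale.
    if !meta.is_file() || meta.len() != new_contents.len() as u64 {
        return Ok(true);
    }
    // Same size: compare the bytes themselves before deciding to skip.
    Ok(fs::read(dest)?.as_slice() != new_contents)
}

/// Hypothetical helper: writes `new_contents` to `dest` only when
/// `needs_write` says so. Returns whether a write actually happened.
fn write_if_changed(dest: &Path, new_contents: &[u8]) -> io::Result<bool> {
    if !needs_write(dest, new_contents)? {
        return Ok(false);
    }
    // Make sure the parent directory exists before writing.
    if let Some(parent) = dest.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(dest, new_contents)?;
    Ok(true)
}
```

In an unpack loop, something like `write_if_changed` would be called once per archive entry with the entry's bytes, so unchanged files are left alone. The sketch assumes each entry fits in memory.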

Would appreciate any insight into this problem, and if I'm using the tar crate incorrectly, or if you notice something wrong with my Rust code. Thanks again!

milesj commented 1 year ago

One thing to add: we can unpack the tarball manually and everything is correct. So that rules out the tarball being corrupted or incorrectly created.

milesj commented 1 year ago

Believe we figured this out. It was a problem with the diffing impl.