I recently came across this issue while extracting a very large gzipped tar file with the `ignore_zeros` option.
About halfway through the archive there's a file with a data section that starts with blocks of nulls. When reading this file, tar tries to parse the first non-null content as a header and fails, setting the done flag in the entry iterator and ending the read: https://github.com/alexcrichton/tar-rs/blob/c3e2cb848afea5954f485f593668e69e0106513e/src/archive.rs#L539-L549
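As a quick illustration of why the reader bails at that point, Python's `tarfile` header parser (`TarInfo.frombuf`) distinguishes an all-zero block from a block of arbitrary file data that fails the header checksum. This is a sketch for illustration only; the sample block contents are made up:

```python
import tarfile

# An all-zero 512-byte block (what ignore_zeros is meant to skip) and a block
# of arbitrary file data that a tar reader would mistake for a header.
blocks = {
    "zeros": b"\0" * 512,
    "data": b"<?xml version='1.0'?>".ljust(512, b"\0"),
}

results = {}
for label, block in blocks.items():
    try:
        tarfile.TarInfo.frombuf(block, tarfile.ENCODING, "surrogateescape")
        results[label] = "parsed"
    except tarfile.HeaderError as e:
        # Record which header error the parser raised for this block.
        results[label] = type(e).__name__

print(results)
```

The zero block is reported as an end-of-archive marker, while the data block fails checksum validation; a reader that treats either case as fatal stops mid-archive.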
I think it would be useful to allow the read to continue even when errors are encountered (either by default or with an option). I've tested with GNU tar and Python's `tarfile` module, and both have this behavior by default and extract all the files. Removing the line that sets the done flag in the error case gives the desired result: with this change all the current tests pass and all the files are extracted the same as GNU tar, but I'm not sure if it's incorrect in other cases.
Here's a minimal archive to test with:
nullfile.tar.gz
The `ignore_zeros` flag must be set to extract everything, and the file with null contents is `bJK/bJK5oTgxVJo.xml`.
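For comparison, here is a self-contained sketch of the recovery behavior I'm describing, using Python's `tarfile`. The member names and contents are made up stand-ins for the attached archive; the zero blocks and the garbage block in the middle play the role of the null-prefixed data section:

```python
import io
import tarfile

def raw_member(name, data):
    """Build one raw tar member: a 512-byte header plus NUL-padded data."""
    info = tarfile.TarInfo(name)
    info.size = len(data)
    padding = b"\0" * (-len(data) % 512)
    return info.tobuf() + data + padding

# First member, then a run of zero blocks and an unparseable block, then a
# second member, then the normal end-of-archive zero blocks.
archive = (
    raw_member("a.txt", b"hello")
    + b"\0" * 1024
    + b"not a valid header".ljust(512, b"\0")
    + raw_member("b.txt", b"world")
    + b"\0" * 1024
)

# By default, reading stops at the first zero block, so only a.txt is seen.
with tarfile.open(fileobj=io.BytesIO(archive)) as tf:
    names_default = tf.getnames()

# With ignore_zeros=True, zero blocks and invalid headers are skipped and the
# second member is recovered.
with tarfile.open(fileobj=io.BytesIO(archive), ignore_zeros=True) as tf:
    names_ignore = tf.getnames()

print(names_default)
print(names_ignore)
```

This mirrors what I'm proposing for tar-rs: with `ignore_zeros` set, an unparseable block mid-archive is skipped rather than terminating the iterator.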