mafintosh / tar-stream

tar-stream is a streaming tar parser and generator.
MIT License
406 stars 92 forks source link

Overwrite header.size with pax header size, if present #78

Closed diamondap closed 6 years ago

diamondap commented 6 years ago

This is a fix for https://github.com/mafintosh/tar-stream/issues/75.

To add a file larger than 8GB into a tar archive, you need to set the size in a pax header, like this:

var inputfile = '/path/to/some/huge/file';
var stats = fs.statSync(tmpFile.name);
var header = {
    name: inputfile,
    size: stats.size,
    mode: stats.mode,
    uid: stats.uid,
    gid: stats.gid,
    mtime: stats.mtime
};
// pax headers allow us to include files over 8GB in size
header.pax = {
    size: stats.size
};

Now pass that header into pack.entry()

The only problem with the existing code was that it wasn't overwriting header.size with header.pax.size. See the doc for the size attribute of the extended pax header at http://pubs.opengroup.org/onlinepubs/009695299/utilities/pax.html#tag_04_100_13_03.

Before this fix, when given a 9GB file, tar-stream would write all 9GB into the tar file, but write the header size as 8589934591 (8GB or octal 77777777777). That resulted in a corrupt tar file.

All existing tests pass. I don't really want to add a 9GB test fixture, but it's working fine for 9GB files.

piranna commented 6 years ago

Maybe you could add a 9GB zeroed file compressed using gunzip, and decompress it when running the tests (and later remove it). This way you could be able to add a test for this use case.

diamondap commented 6 years ago

I added a test in test/extract.js to untar an archive that contains an 8.2 GB file. The new fixture is gzipped down to about 8MB. The test unzips it to a stream, so it doesn't use any extra disk space during the test run.

piranna commented 6 years ago

The test unzips it to a stream, so it doesn't use any extra disk space during the test run.

Nice trick! :-D Reading the test code I though you were creating the file on the fly, that would have been a good idea too :-) Good job! :-D

diamondap commented 6 years ago

My initial report said that pack was creating a corrupt tar file when you added a file larger than 8GB. Just to clarify, pack.js was working correctly all along. extract.js just needed a minor fix to read the correct size from the pax header.

mafintosh commented 6 years ago

Hey! Really good stuff. Just letting you know I had to move the "huge" to it's own test file "slow.js" that is not run per default (using the test-all script it is), because it was slowing down the tests so much people started thinking it was broken.

mafintosh commented 6 years ago

Love the tests though! Keep it up :)

diamondap commented 6 years ago

It makes sense to move that. The test will be very slow on some systems.