archiverjs / node-zip-stream

a streaming zip archive generator
http://archiverjs.com/zip-stream
MIT License

Corrupt ZIP archive when streaming in 200+ files #131

Open nick-george opened 2 years ago

nick-george commented 2 years ago

Hi there,

Note that I've created this same issue on the "archiver" project here: https://github.com/archiverjs/node-archiver/issues/602

I strongly suspect that the issue discussed below is an issue with this library as opposed to "archiver". I have tried using its "TAR" output and am not experiencing any issues.

I've been troubleshooting an issue where archiver appears to be generating corrupt archives. Using version 5.3.1 on node v16.13.1.

We're streaming in files that have been retrieved from ssh2-sftp-client.

This library seems to work fine with very large archives built from a few large files. It also seems to work fine for archives up to 199 files. However, when I have 200 files or more, the archive gets corrupted. By diffing the hexdumps of one archive that has 199 files and another that has 200, I can see the archive with 200 files is missing the "End of central directory record" (EOCD). See below for the bytes that are missing from my archive with 200 files (note the first four bytes below are the last part of the last filename in the archive).

0039b9b0 65 2e 70 70 50 4b 05 06 00 00 00 00 c6 00 c6 00 |e.ppPK..........|
0039b9c0 f4 3c 00 00 c0 7c 39 00 00 00 |.<...|9...|
0039b9ca

Otherwise, the generated files are pretty much identical (except for one less file being present in the "good" archive).
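For anyone who wants to check their own output for the same symptom, here is a minimal sketch that scans the tail of an archive for the EOCD signature (`PK\x05\x06`, the `50 4b 05 06` bytes in the dump above). The function name `findEocdOffset` is made up for illustration and is not part of zip-stream or archiver. It assumes the whole archive is already in memory as a `Uint8Array` and only looks for the signature; it does not validate the rest of the record.

```typescript
// Fixed part of the EOCD record is 22 bytes; it may be followed by a
// comment of at most 65535 bytes, so the signature has to sit in the
// last 22 + 65535 bytes of the file.
const EOCD_MIN_SIZE = 22;
const MAX_COMMENT_LENGTH = 0xffff;

/**
 * Hypothetical helper: returns the byte offset of the EOCD signature
 * ("PK\x05\x06"), searching backwards from the end, or -1 if it is missing.
 */
export function findEocdOffset(zip: Uint8Array): number {
  if (zip.length < EOCD_MIN_SIZE) {
    return -1; // too short to contain an EOCD record at all
  }
  const lowest = Math.max(0, zip.length - EOCD_MIN_SIZE - MAX_COMMENT_LENGTH);
  for (let i = zip.length - EOCD_MIN_SIZE; i >= lowest; i--) {
    if (
      zip[i] === 0x50 &&
      zip[i + 1] === 0x4b &&
      zip[i + 2] === 0x05 &&
      zip[i + 3] === 0x06
    ) {
      return i;
    }
  }
  return -1;
}
```

A result of `-1` on a freshly written archive would match the truncation described above, where the file ends before the EOCD record.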

Are you aware of any file count limit for this library?

Many thanks, Nick

nick-george commented 2 years ago

Oh whoops, I see this project and archiver are part of the same organisation. Apologies for the duplication. I'm happy to close this ticket in whatever repo you think is less appropriate.

Veragin commented 1 year ago

Hi, I still have the same problem: when I create a zip archive with lots of small files, it very often generates a corrupted archive.

Some data is missing: I compared the size of the corrupted zip with a correct one (which I luckily got from a successful run).

I was trying to zip around 9000 images and was missing about 700 KB of the 50 MB final zip archive ... but even a smaller set (600 images => 4 MB) did not work.

The corrupted zip file can be opened with the 7-Zip application on Windows, but other software can't handle it.

EDIT: Found the cause of the problem: zip.finish() is probably async somewhere inside, so it needs some time to write the data. If I wait about 5 s after finish() is called, everything is fine.

doriancollier commented 6 months ago

> Found cause of the problem, zip.finish() is probably async somewhere inside so it needs some time to write the data if I wait like 5s after the finish() is called everything is fine

I had the same issue, and adding a 5s wait after finish() did help.

I kept experimenting and found out that the file isn't fully written until the output "close" event fires, which happens after finish(). To handle this, I wrapped all of my zip code in a promise that resolves after the "close" event.

Here's my final function...

```typescript
import fs from 'fs';
import archiver from 'archiver';

// Note: `files` and `logger` are project-specific helpers that are not shown here.

export async function zipDirectory(
    directoryToZip: string,
    zipFilePath: string,
): Promise<string> {
    const archive = archiver('zip', { zlib: { level: 9 } });
    const tempRootDir = await files.getTempRootDir();

    const finalDirectoryToZip = `${tempRootDir}/${directoryToZip}`;
    const finalZipFilePath = `${tempRootDir}/${zipFilePath}`;
    const output = fs.createWriteStream(finalZipFilePath);

    return new Promise((resolve, reject) => {
        // The zip file is only complete once the output stream has closed.
        output.on('close', () => {
            resolve(zipFilePath); // Resolve with the zipFilePath
        });

        // Write streams never emit 'end'; listen for 'error' instead so a
        // failed write rejects the promise rather than leaving it pending.
        output.on('error', (err) => {
            logger.error(`Output stream error: ${err}`);
            reject(err);
        });

        archive.on('warning', (err) => {
            if (err.code === 'ENOENT') {
                logger.warn(`File not found: ${err}`);
            } else {
                logger.error(`Archiver warning: ${err}`);
                reject(err);
            }
        });

        archive.on('error', (err) => {
            logger.error(`Archiver error: ${err}`);
            reject(err);
        });

        archive.pipe(output);
        archive.directory(finalDirectoryToZip, false);
        archive.finalize();
    });
}
```
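For comparison, the same "don't touch the file until the output is done" idea can be sketched with Node's built-in `pipeline` from `stream/promises`, which resolves once the destination stream has finished and rejects if either side errors. This is only a sketch: `writeStreamToFile` is a hypothetical helper name, and `source` stands in for the archive stream. It assumes the archive object behaves like an ordinary Node readable stream; with archiver you would presumably still need to call `finalize()` so that the source ends.

```typescript
import { createWriteStream, promises as fsp } from "node:fs";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";

/**
 * Hypothetical helper: pipes `source` into a file and resolves with the
 * number of bytes on disk once the write stream has finished.
 */
export async function writeStreamToFile(
  source: Readable,
  filePath: string,
): Promise<number> {
  // pipeline() wires up error handling on both streams and only resolves
  // after the destination is done, so no fixed sleep is needed.
  await pipeline(source, createWriteStream(filePath));
  const { size } = await fsp.stat(filePath);
  return size;
}
```

The point of this variant is that errors on the output stream are handled as well, without registering each event listener by hand.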