Stuk / jszip

Create, read and edit .zip files with Javascript
https://stuk.github.io/jszip/

jszip runs out of memory when creating a zip file with 20,000 files #446

Open dteviot opened 7 years ago

dteviot commented 7 years ago

The problem can be reproduced in Chrome using the following file. Note that jszip.js needs to be in the same directory as this file.

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Demonstrate jszip memory use</title>
    <base />
</head>
<body>
    <button id="prepZip">Prepare Zip</button>
    <button id="genZip">GenerateZip</button>
    <script src="jszip.js"></script>
<script>
"use strict";

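class ZipTest {
    // Queue 20,000 tiny text files in a new archive (nothing is compressed yet).
    prepZip() {
        let content = "dummy";
        let zipOptions = { compression: "DEFLATE" };
        this.zipFile = new JSZip();
        for (let i = 0; i < 20000; ++i) {
            this.zipFile.file(`${i}.txt`, content, zipOptions);
        }
    }

    // Generate the archive; this is the step that exhausts memory.
    genZip() {
        return this.zipFile.generateAsync({ type: "blob" }).then(function (content) {
            // The generated blob is discarded; only memory use matters here.
            return Promise.resolve();
        });
    }

    // Wire the two buttons to the methods above.
    init() {
        document.querySelector("#prepZip").onclick = this.prepZip.bind(this);
        document.querySelector("#genZip").onclick = this.genZip.bind(this);
    }
}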

let test = new ZipTest();
test.init();
</script>
</body>
</html>

When the above HTML file is first opened, Task Manager shows 15 MB of memory. After clicking Prepare Zip and forcing a garbage collection, memory is 32 MB. After clicking GenerateZip, memory climbs to more than 3 GB, and then Chrome shows "Aw, Snap!".

The root cause seems to be generateWorker() in lib/generate/index.js (https://github.com/Stuk/jszip/blob/master/lib/generate/index.js). It creates a compression worker for every file before it starts compressing any of them. Each worker contains 9 buffers of 64 KB each.
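
As a rough sanity check, here is the arithmetic those figures imply. This is only a sketch: the 9 buffers and 64 KB per buffer are the numbers reported above, not measured independently, and the function name is made up for illustration.

```javascript
// Figures taken from the report above; treat them as estimates.
const fileCount = 20000;        // files added to the archive
const buffersPerWorker = 9;     // buffers per compression worker (as reported)
const bufferBytes = 64 * 1024;  // 64 KiB per buffer (as reported)

// Hypothetical helper: memory held by worker buffers if every worker
// is created before any file is compressed.
function estimateWorkerBufferBytes(files, buffers, bytesPerBuffer) {
    return files * buffers * bytesPerBuffer;
}

const totalBytes = estimateWorkerBufferBytes(fileCount, buffersPerWorker, bufferBytes);
const totalGiB = totalBytes / (1024 ** 3);
console.log(`${totalBytes} bytes, about ${totalGiB.toFixed(2)} GiB`);
// prints "11796480000 bytes, about 10.99 GiB"
```

If all of those buffers really were allocated up front, the total would be far beyond what a single tab can hold, which is consistent with the tab dying somewhere past 3 GB.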

dduponchel commented 7 years ago

Testing with just 200 objects and generateInternalStream (which starts paused and doesn't generate any output) gives insane results: 0.73 MB (page load) -> 1.23 MB (prepare) -> 56.93 MB (generate)

Creating the workers on demand would be a better idea, even if it still involves copying resources (we need to "freeze" the files). A quick fix would be to lazy-load the pako object: currently we create one per file, and each instance has its own internal 64k buffer, even though only one pako object is used at a time.
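
To illustrate the lazy-loading idea, here is a minimal sketch. It is not JSZip's actual code: FakeDeflater stands in for a pako instance with one 64k buffer, and LazyCompressWorker, processFile and runLazy are hypothetical names. The point is only that the buffer is allocated when a file is actually processed, not when its worker is created.

```javascript
// Counters used to observe how many buffers are alive at once.
let liveBuffers = 0;
let peakLiveBuffers = 0;

// Hypothetical stand-in for a pako deflate instance: owns one 64 KiB buffer.
class FakeDeflater {
    constructor() {
        this.buffer = new Uint8Array(64 * 1024);
        liveBuffers += 1;
        peakLiveBuffers = Math.max(peakLiveBuffers, liveBuffers);
    }
    release() {
        this.buffer = null;
        liveBuffers -= 1;
    }
}

// Hypothetical per-file worker: the deflater is created on first use,
// not in the constructor, and dropped once the file is finished.
class LazyCompressWorker {
    constructor(name) {
        this.name = name;
        this.deflater = null; // nothing allocated yet
    }
    processFile() {
        if (this.deflater === null) {
            this.deflater = new FakeDeflater();
        }
        // ... compression of this file would happen here ...
        this.deflater.release();
        this.deflater = null;
    }
}

function runLazy(fileCount) {
    // All workers still exist up front, but they hold no buffers.
    const workers = [];
    for (let i = 0; i < fileCount; ++i) {
        workers.push(new LazyCompressWorker(`${i}.txt`));
    }
    // Files are processed one at a time, so at most one deflater is alive.
    for (const worker of workers) {
        worker.processFile();
    }
    return peakLiveBuffers;
}

const lazyPeak = runLazy(200);
console.log(`peak live 64 KiB buffers: ${lazyPeak}`);
```

With this shape the worker objects themselves are still created for every file, so some per-file overhead remains; only the large buffers are deferred.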

After this quick fix, I get: 0.86 MB (page load) -> 1.20 MB (prepare) -> 1.98 MB (generate)

For 20000 objects (still with generateInternalStream), I get: 0.86 MB (page load) -> 10.94 MB (prepare) -> 113.68 MB (generate)

I still see a lot of created objects and I'm not sure we need them all (but that's still way better). I'll prepare a patch.

dduponchel commented 7 years ago

The partial fix has been released in JSZip v3.1.4.