101arrowz / fflate

High performance (de)compression in an 8kB package
https://101arrowz.github.io/fflate
MIT License
2.21k stars 77 forks source link

Unzip: Maximum call stack size exceeded #139

Closed solomax closed 2 years ago

solomax commented 2 years ago

Hello,

fflate version 0.7.3 (browser version)

The code is as simple as:

import { Unzip, UnzipInflate, unzipSync } from 'fflate';

  function getFile() {
    const blob = _file.slice(0, _file.size);
    return blob.arrayBuffer();
  }

async test() {
  const files = []
    , data = new Uint8Array(await getFile())
    , unzipper = new Unzip(file => {
      if (!file.name.startsWith('prefix/')) {
        const obj = {
          name: file.name
          , size: file.originalSize
        };
        files.push(obj);
      }
    });
  unzipper.register(UnzipInflate);
  unzipper.push(data, true);
}

_file is the file selected by the user.

The code fails with some files created by https://www.libarchive.org/ (I'm asking my customer whether it would be possible to share the problem file with you).

The problem is only reproducible in a production build. According to my debugging, this is the problem line: https://github.com/101arrowz/fflate/blob/master/src/index.ts#L3202 (the recursion seems to be too deep).

I would appreciate any help with this issue :)

101arrowz commented 2 years ago

First thing: since you're pushing all the data in at once, using the streaming API isn't really necessary. You can get the same result with smaller bundle size and the same performance like this:

import { unzipSync } from 'fflate';

const files = [];

unzipSync(new Uint8Array(await blob.arrayBuffer()), {
  filter(file) {
    // Never decompress but still get metadata
    if (!file.name.startsWith('prefix/')) {
      files.push({ name: file.name, size: file.originalSize }); 
    }
    return false;
  }
});

If you really wanted to take advantage of the lower memory usage of the streaming API, you'd need to use File.prototype.stream to get a ReadableStream and then pipe that through fflate. But then again, since you're not decompressing any data, there's really no point to that either.
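To make that concrete, here is a rough sketch of driving a streaming unzipper from a ReadableStream (such as the one returned by `file.stream()`). The `streamIntoUnzipper` helper name is my own, not part of fflate; `unzipper` is assumed to be an fflate `Unzip` instance with `UnzipInflate` registered, as in the snippets above.

```javascript
// Sketch: pump a ReadableStream (e.g. from File.prototype.stream or a
// fetch response body) into a streaming unzipper, one chunk at a time.
async function streamIntoUnzipper(stream, unzipper) {
  const reader = stream.getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) {
      // an empty final push tells the unzipper the stream has ended
      unzipper.push(new Uint8Array(0), true);
      break;
    }
    unzipper.push(value);
  }
}
```

In the browser you would call it as `await streamIntoUnzipper(_file.stream(), unzipper)`.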

Second, I think I've seen this problem before, but for some reason I can't find the prior bug report... Anyway, the fix for this is to stream in chunks no larger than 64kB: if there are a few thousand files within a single chunk you pass to unzipper.push, the issue will arise.
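For anyone who wants to keep the streaming API, a minimal sketch of that workaround could look like this. The `pushChunked` helper name is my own, not part of fflate; `unzipper` is assumed to be a streaming `Unzip` instance.

```javascript
// Hypothetical helper: feed a large buffer to a streaming unzipper in
// slices of at most 64kB, so no single push contains too many files.
function pushChunked(unzipper, data, chunkSize = 65536) {
  if (data.length === 0) {
    unzipper.push(data, true); // nothing to split; just signal completion
    return;
  }
  for (let i = 0; i < data.length; i += chunkSize) {
    const end = Math.min(i + chunkSize, data.length);
    // subarray avoids copying; only the last slice carries the final flag
    unzipper.push(data.subarray(i, end), end === data.length);
  }
}
```

Used in place of `unzipper.push(data, true)` in the original snippet.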

Let me know if the first code snippet fixes things.

solomax commented 2 years ago

Many thanks for the hint, @101arrowz. Works as expected!

kenorb commented 2 years ago

Streaming example code: unzipping a downloaded zip.

kenorb commented 2 years ago

I hit that error as well with fflate@0.7.3:

RangeError: Maximum call stack size exceeded

when using the following code (similar to the one posted above):

import {zip, unzip, unzipSync, AsyncUnzipInflate, Zip, Unzip, UnzipInflate} from "fflate";

const downloadFilesFromZip = async (url) => {
  console.log("Downloading from " + url + "...");
  const unzipper = new Unzip();
  unzipper.register(AsyncUnzipInflate);
  unzipper.onfile = (file) => {
    //console.log("Got", file.name);
    const rs = new ReadableStream({
      start(controller) {
        file.ondata = (err, dat, final) => {
          controller.enqueue(dat);
          if (final) controller.close();
        };
        file.start();
      },
    });
    createWriteStream(file.name, rs);
  };
  const res = await fetch(url);
  const reader = res.body.getReader();
  while (true) {
    const { value, done } = await reader.read();
    if (done) {
      unzipper.push(new Uint8Array(0), true);
      break;
    }
    unzipper.push(value);
  }
};

Called with (in my code, I'm fetching file from local):

downloadFilesFromZip('https://ftp.drupal.org/files/projects/drupal-8.9.20.zip');

The zip contains over 30k files, but when I used jszip, I didn't have any errors with this file (though I wasn't using streams).

Stack trace: (screenshot attached, 2022-08-21)

Repro steps: you can use the following project, with commit 3b06f8d as HEAD, with DevTools open.

kenorb commented 2 years ago

I've refactored the code using the example mentioned above, and it worked fine (without errors).

  async loadFiles() {
    let files = [];
    // GET request using fetch with the desired headers.
    await fetch("https://ftp.drupal.org/files/projects/drupal-8.9.20.zip", {
      headers: {
        "Content-Type": "application/zip",
        Encoding: "binary",
      },
    })
      // Fetch the zip file as a blob.
      .then((response) => response.blob())
      // Read the zip data into an ArrayBuffer.
      .then((blob) =>
        blob.arrayBuffer().then((arr) => {
          // Unzip the data and collect the file entries.
          files = unzipSync(new Uint8Array(arr), {
            filter(file) {
              // Filter out unwanted files (skip .txt entries).
              return !file.name.endsWith(".txt");
            },
          });
        })
      )
      .catch((error) => console.log(error));
  }