whatwg / compression

Compression Standard
https://compression.spec.whatwg.org/

DecompressionStream should support pull-based decompression #66

Closed mstange closed 3 weeks ago

mstange commented 4 weeks ago

What is the issue with the Compression Standard?

I have a web app which consumes gzipped data. On large datasets, the decompressed size can be a gigabyte or more. I would like to minimize memory usage by never materializing the fully decompressed buffer in memory; instead, I would like to decompress in chunks and process each decompressed chunk as it arrives.

It seems that I cannot use DecompressionStream to achieve this at the moment. As specced, when the compressed data comes from the network, the browser has to decompress it all and enqueue the decompressed chunks. I would prefer to have the browser only hold on to the compressed memory until I ask for the next decompressed chunk.

saschanaz commented 4 weeks ago

> As specced, when the compressed data comes from the network, the browser has to decompress it all and enqueue the decompressed chunks.

Can you show me how you are using the stream? TransformStreams are not supposed to consume greedily and have highWaterMark=1, meaning each one is only allowed to hold a single unconsumed chunk and then stops processing until that chunk is read.

With the following demo, it only reads the first chunk and then waits.

// The string "HELLO", gzip-compressed and split into four chunks
let buffer = [
  new Uint8Array([31,139,8,0,0]),
  new Uint8Array([0,0,0,0,10]),
  new Uint8Array([243,112,245,241,241]),
  new Uint8Array([7,0,54,100,68,193,5,0,0,0]),
];
let index = 0;
let r = new ReadableStream({
  pull(controller) {
    if (index >= buffer.length) {
      controller.close();
      return;
    }
    console.log("pulled", index);
    controller.enqueue(buffer[index]);
    ++index;
  },
  type: "bytes"
});
let d = new DecompressionStream("gzip");
let pipe = r.pipeThrough(d);
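
To watch the rest of the input being pulled, the decompressed side can be read one chunk at a time. The following is only a minimal sketch, assuming a runtime that exposes ReadableStream and DecompressionStream as globals. `gzipChunks` and `readDecompressed` are hypothetical helper names used for illustration, not part of the spec. How far the stream reads ahead between `read()` calls depends on the implementation's queuing strategy.

```javascript
// Hypothetical helper: the gzip-compressed string "HELLO" from the demo,
// split into the same four chunks.
function gzipChunks() {
  return [
    new Uint8Array([31, 139, 8, 0, 0]),
    new Uint8Array([0, 0, 0, 0, 10]),
    new Uint8Array([243, 112, 245, 241, 241]),
    new Uint8Array([7, 0, 54, 100, 68, 193, 5, 0, 0, 0]),
  ];
}

// Hypothetical helper: decompress `chunks` and hand each decompressed chunk
// to `onChunk` before asking for the next one. Nothing is accumulated here;
// the function only returns the total number of decompressed bytes.
async function readDecompressed(chunks, onChunk) {
  let index = 0;
  const source = new ReadableStream({
    pull(controller) {
      if (index >= chunks.length) {
        controller.close();
        return;
      }
      console.log("pulled", index);
      controller.enqueue(chunks[index]);
      ++index;
    },
  });

  const reader = source
    .pipeThrough(new DecompressionStream("gzip"))
    .getReader();

  let total = 0;
  for (;;) {
    // Each read() asks for one decompressed chunk.
    const { done, value } = await reader.read();
    if (done) break;
    total += value.byteLength;
    // Waiting for the handler delays the next read().
    await onChunk(value);
  }
  return total;
}
```

A caller would pass its own per-chunk handler, for example `readDecompressed(gzipChunks(), (chunk) => console.log(chunk.byteLength))`.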

Please correct me if I misunderstood your issue.

mstange commented 4 weeks ago

Ah, that means I misunderstood how it works and need to do some more debugging. Thanks for taking a look!