101arrowz / fflate

High performance (de)compression in an 8kB package
https://101arrowz.github.io/fflate
MIT License

support `DecompressionStream` when available #152

Closed by ThaUnknown 1 year ago

ThaUnknown commented 1 year ago

What can't you do right now?

Take advantage of native decompression streams

An optimal solution

Detect support for DecompressionStream and use it when present; fall back to the default fflate implementation when it is not.
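A minimal sketch of what such detection could look like, assuming the caller passes in a synchronous fallback such as `fflate.gunzipSync`. The names `hasNativeDecompression`, `gunzipWithFallback`, and `SyncGunzip` are hypothetical and not part of fflate; only `DecompressionStream`, `ReadableStream`, and `Response` are standard APIs.

```typescript
// Hypothetical helper type: any synchronous gzip decoder, e.g. fflate.gunzipSync.
type SyncGunzip = (data: Uint8Array) => Uint8Array;

// True when the runtime exposes the Compression Streams API.
function hasNativeDecompression(): boolean {
  return typeof DecompressionStream === "function";
}

// Decompress gzip data natively when possible, otherwise use the fallback.
async function gunzipWithFallback(
  data: Uint8Array,
  fallback: SyncGunzip,
  preferNative: boolean = hasNativeDecompression(),
): Promise<Uint8Array> {
  if (!preferNative) return fallback(data);
  // Wrap the bytes in a one-chunk stream so they can be piped.
  const source = new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(data);
      controller.close();
    },
  });
  const decoded = source.pipeThrough(new DecompressionStream("gzip"));
  // Response collects every output chunk into a single buffer.
  return new Uint8Array(await new Response(decoded).arrayBuffer());
}
```

A caller would use it as `await gunzipWithFallback(bytes, fflate.gunzipSync)`. Whether the native path is actually faster depends on the runtime and payload size, so it would be worth measuring before preferring it.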

(How) is this done by other libraries?

To my knowledge, only zip-go supports DecompressionStream, and it doesn't fall back to anything.

DecompressionStream only supports deflate (including deflate-raw) and gzip, but it might be worth taking advantage of.

101arrowz commented 1 year ago

Actually it would probably be better to make a polyfill for CompressionStream and DecompressionStream backed by fflate than to add support for it here. I can try to work on that in the coming days.

ThaUnknown commented 1 year ago

> Actually it would probably be better to make a polyfill for CompressionStream and DecompressionStream backed by fflate than to add support for it here. I can try to work on that in the coming days.

fflate exposes a better API for files than those streams do, so I think just accelerating the library with them would be better. But why not both?

101arrowz commented 1 year ago

After profiling the performance, it tends to be faster to just use fflate in most cases, due to streaming overheads. However, if anyone wants a Compression Streams API polyfill using fflate, I've made one available at https://github.com/101arrowz/compression-streams-polyfill.
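To illustrate the general idea of backing a stream with a push-style decoder, here is a rough sketch. It is not taken from the polyfill linked above. The `PushDecoder` interface and `toTransformStream` function are hypothetical names, modeled on the `ondata`/`push(chunk, final)` shape of fflate's streaming classes; treat that shape as an assumption to check against the fflate documentation.

```typescript
// Hypothetical minimal shape of a push-style decoder (assumed to match
// fflate's streaming classes such as Gunzip).
interface PushDecoder {
  ondata: (chunk: Uint8Array, final: boolean) => void;
  push(chunk: Uint8Array, final?: boolean): void;
}

// Adapt a push-style decoder to a standard TransformStream.
function toTransformStream(
  makeDecoder: () => PushDecoder,
): TransformStream<Uint8Array, Uint8Array> {
  const decoder = makeDecoder();
  return new TransformStream<Uint8Array, Uint8Array>({
    start(controller) {
      // Forward every decoded chunk to the readable side.
      decoder.ondata = (chunk) => controller.enqueue(chunk);
    },
    transform(chunk) {
      decoder.push(chunk, false);
    },
    flush() {
      // Signal end of input with an empty final chunk.
      decoder.push(new Uint8Array(0), true);
    },
  });
}
```

Under these assumptions, `stream.pipeThrough(toTransformStream(() => new fflate.Gunzip()))` would decompress a gzip stream. Each chunk crosses the stream machinery once on the way in and once on the way out, which is one plausible source of the streaming overhead mentioned above.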

manzt commented 10 months ago

Just having a look at DecompressionStream in Deno.

import { assert } from "https://deno.land/std@0.210.0/assert/mod.ts";
import * as fflate from "npm:fflate";
import * as pako from "npm:pako";

// Decompress a gzip stream; asserts that the output arrives as a single chunk.
async function decode_stream(stream: ReadableStream) {
  const reader = stream
    .pipeThrough(new DecompressionStream("gzip"))
    .getReader();
  let bytes: Uint8Array;
  {
    const result = await reader.read();
    assert(!result.done, "ReadableStream must have data");
    bytes = result.value;
  }
  {
    const result = await reader.read();
    assert(result.done, "ReadableStream must have no more data");
  }
  return bytes;
}

const base = new URL(
  "https://raw.githubusercontent.com/zarr-developers/zarr_implementations/5dc998ac72/examples/zarr.zr/gzip/.zarray",
);

const BYTES = await fetch(new URL("0.0.0", base))
  .then((r) => r.arrayBuffer())
  .then((b) => new Uint8Array(b));

const REFERENCE = fflate.gunzipSync(BYTES);

Deno.bench("decode_stream", async () => {
  const stream = new ReadableStream({
    start(controller) {
      controller.enqueue(BYTES);
      controller.close();
    },
  });
  const result = await decode_stream(stream);
  assert(result.length === REFERENCE.length);
});

Deno.bench("fflate.gunzip", () => {
  const result = fflate.gunzipSync(BYTES);
  assert(result.length === REFERENCE.length);
});

Deno.bench("pako.inflate", () => {
  const result = pako.inflate(BYTES);
  assert(result.length === REFERENCE.length);
});

> deno bench -A gzip.ts
cpu: Apple M3 Max
runtime: deno 1.39.1 (aarch64-apple-darwin)

file:///Users/manzt/demos/gzip.ts
benchmark          time (avg)        iter/s             (min … max)       p75       p99      p995
------------------------------------------------------------------- -----------------------------
decode_stream      40.66 µs/iter      24,591.2    (36.96 µs … 1.37 ms)     41 µs  62.92 µs  88.58 µs
fflate.gunzip      67.92 µs/iter      14,723.0    (63.83 µs … 1.68 ms)  66.62 µs  82.67 µs  88.25 µs
pako.inflate      112.22 µs/iter       8,910.9    (102.83 µs … 1.4 ms) 109.54 µs 139.58 µs 182.42 µs

summary
  decode_stream
   1.67x faster than fflate.gunzip
   2.76x faster than pako.inflate

ThaUnknown commented 10 months ago

I don't think that's a fair comparison, as you're wrapping fflate in a ReadableStream. Realistically, you wouldn't use readable streams for this.

manzt commented 10 months ago

I’m not sure I understand. fflate is decompressing BYTES directly; there is no readable stream in the fflate code path. Only the non-fflate bench (decode_stream) creates the readable stream.

Might be worth looking into Deno.bench.
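The `decode_stream` helper in the benchmark asserts that the decompressed output arrives as exactly one chunk, which may not hold for larger payloads. Here is a sketch of a variant that collects every chunk; `decodeAllChunks` and `concatChunks` are hypothetical names, not part of the original benchmark.

```typescript
// Join a list of chunks into one contiguous Uint8Array.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((n, c) => n + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// Decompress a gzip stream, reading until the stream reports done.
async function decodeAllChunks(
  stream: ReadableStream<Uint8Array>,
): Promise<Uint8Array> {
  const reader = stream
    .pipeThrough(new DecompressionStream("gzip"))
    .getReader();
  const chunks: Uint8Array[] = [];
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
  }
  return concatChunks(chunks);
}
```

The extra copy in `concatChunks` adds some cost, so timings from this variant would not be directly comparable to the single-chunk numbers reported above.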

ThaUnknown commented 10 months ago

Oh sorry, I misread.