101arrowz / fflate

High performance (de)compression in an 8kB package
https://101arrowz.github.io/fflate
MIT License
2.27k stars 79 forks source link

Streams onData do not work as expected anymore #188

Closed MoeKasp closed 8 months ago

MoeKasp commented 1 year ago

How to reproduce

    const compressor = new Zlib();
    const data = "test";
    let result;
    compressor.ondata = (data) => {
      result = data;
    };
    this.compressor.push(strToU8(data));

    return result;

result will be still undefined as the ondata callback never runs.

The problem In v0.7.x the on data function would have been executed as soon as you would push. In v0.8.x it only runs if you pass true as the second argument. We use the intermittent chunks and we would appreciate a way to use them again.

I believe this change in behaviour is also undocumented.

101arrowz commented 1 year ago

This is an intentional change. The compressor used to accept very tiny chunks which yielded very bad compression ratios. Now it collects your data into chunks from 8kB to 64kB to compress. 8kB is a small enough size that for most use cases (writing the compressed data to a filesystem/uploading to the web/etc.) waiting until you've pushed that much data adds no additional latency. What's your use case where you need the compressed data immediately?

MoeKasp commented 1 year ago

Thanks for the quick reply. We are sending compressed messages via a network connection. The server expects us to just send a Zlib compressed stream. So every time we send a message, we run in through the compressor and send the chunk. The server decompresses the chunk and handles everything else.

101arrowz commented 1 year ago

OK, you probably need a flush() method (which forcibly emits all the data in the buffer). For now you can stick with v0.7, I'll add that method in the next version.

MoeKasp commented 1 year ago

awesome! Thank you 👍

101arrowz commented 8 months ago

Added a flush() in v0.8.2 to all the streams, let me know if you have any issues!

MoeKasp commented 8 months ago

Hi @101arrowz , sorry to bother you again. We just now came to trying v0.8.2.

But it still does not seem to work like v0.7.

import {Zlib, strFromU8, strToU8} from 'fflate';

export class Compressor {
  private compressor: Zlib | null;

  public compress(uncompressedData: string): Uint8Array {
    if (!this.compressor) {
      this.compressor = new Zlib();
    }
    let result;
    this.compressor.ondata = (data) => {
      result = data;
    };
    this.compressor.push(strToU8(uncompressedData));
    // v0.8.2 code just added the flush
    this.compressor.flush();

    return result;
  } 
}

We would expect it to pass this test, as the v0.7 did pass it too.

import {Compressor} from './Compressor';
import {describe, it, expect} from 'vitest';

describe('Compressor', () => {
  const sut = new Compressor();
  const compressedWithHeader = new Uint8Array([120, 156, 0, 4, 0, 251, 255, 84, 101, 115, 116]);
  const compressedWithoutHeader = new Uint8Array([0, 4, 0, 251, 255, 84, 101, 115, 116]);

  it('compress() should return correct value', () => {
    const compressedResult1 = sut.compress('Test');
    const compressedResult2 = sut.compress('Test');
    expect(compressedResult1).toEqual(compressedWithHeader);
    expect(compressedResult2).toEqual(compressedWithoutHeader);
  });
});

unfortunatly we get these Uint8Arrays:

Compressed data: Uint8Array(7) [
  120, 156, 10, 73,
   45,  46,  1
]

Compressed data: Uint8Array(3) [ 8, 132, 1 ]

101arrowz commented 8 months ago

If you try decompressing the output, you'll see that it's actually still correct, and Zlib correctly decompresses each chunk without waiting for more data. fflate is now actually producing much smaller outputs when flushing than the standard streams in v0.7 did, so you shouldn't need to make any changes!

I do have some recommendations though. I would not recommend examining compressed outputs of fflate in your unit tests. There is no guarantee that fflate will produce the same binary output from version to version when compressing a given input. I would instead try decompressing things with Unzlib in your unit tests to make sure you're getting each chunk in full every time.

Also, there is no longer a guarantee that fflate produces one output chunk per input chunk, so you need to concatenate the buffers before returning. It just happens that in this case fflate is still only producing one chunk per input, but it may not in the future, so I suggest you change to this:

import {Zlib, strFromU8, strToU8} from 'fflate';

// concatenates output buffers
const concat = (buffers: Uint8Array[]) => {
  if (buffers.length === 1) return buffers[0];

  const out = new Uint8Array(buffers.reduce((size, chk) => size + chk.length, 0));
  for (let i = 0, off = 0; i  < buffers.length; ++i) {
    out.set(buffers[i], off);
    off += buffers[i].length;
  }

  return out;
};

export class Compressor {
  private compressor: Zlib | null;

  public compress(uncompressedData: string): Uint8Array {
    if (!this.compressor) {
      this.compressor = new Zlib();
    }
    let results = [];
    this.compressor.ondata = (data) => {
      results.push(data);
    };
    this.compressor.push(strToU8(uncompressedData));
    // v0.8.2 code just added the flush
    this.compressor.flush();

    return concat(results);
  } 
}
MoeKasp commented 8 months ago

Thank you! You are right! We modified the unit tests and are now using the newest version! You helped us a lot! 👍