101arrowz / fflate

High performance (de)compression in an 8kB package
https://101arrowz.github.io/fflate
MIT License

JS file #97

Closed photopea closed 2 years ago

photopea commented 3 years ago

Hi, could you provide a JS file of this library? Is this library open-source? I do not use NPM.

101arrowz commented 3 years ago

Yes, the library is open-source. The source code is written in TypeScript, which is like JavaScript but with strong typing enforced at compile time. TypeScript compiles to JavaScript files that are distributed via NPM but can be downloaded without installing NPM if you wish.

If you want a global variable fflate to contain the library (like you've done in UZIP), you can download the file at https://cdn.jsdelivr.net/npm/fflate/umd/index.js. However please note that the size benefits of this library can't be achieved like this because the online file contains the entire codebase, whereas if you use NPM you can install only the parts of the JS file that you need. Basically this reduces the size from 30kB minified/11kB gzipped to 12kB minified/6kB gzipped for compression + decompression of DEFLATE, ZLIB, and ZIP (like in UZIP). If you want this smaller size, let me know and I can provide you a custom bundle or give you instructions on how to make one.

101arrowz commented 3 years ago

@photopea Did that link work for you?

photopea commented 3 years ago

I tried to inflate a 192 MB file.

fflate.unzlibSync took 6075 ms
pako.inflateRaw took 5563 ms
UZIP.inflateRaw took 4379 ms

Is there a method in "fflate" similar to "inflateRaw", which does not check the CRC, so that it could be faster? Also, could you provide the same interface as pako and UZIP, so that we can just replace "pako" with "fflate" in the code to switch to your library?

UZIP is probably faster because it accepts the output target array as an input, and it trusts you that the result will fit into it. No size checks or reallocations are needed during decoding.
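To illustrate the reallocation cost described above, here is a minimal sketch. It is not taken from UZIP, pako, or fflate; `writeChunks` and `WriteResult` are hypothetical names, and the doubling growth strategy is an assumption chosen for illustration. Unlike the behavior described for UZIP, this sketch still checks that a caller-supplied target is large enough.

```typescript
// Illustrative only: this is not code from UZIP, pako, or fflate.
// It counts how often an output buffer has to be reallocated when the
// final size is unknown, versus when the caller supplies the target.

interface WriteResult {
  output: Uint8Array;      // view over the bytes actually written
  reallocations: number;   // how many times a bigger buffer was allocated
}

// Appends every chunk to `target` when one is given. Otherwise starts with a
// small buffer and doubles its capacity whenever the next chunk does not fit.
function writeChunks(chunks: Uint8Array[], target?: Uint8Array): WriteResult {
  let output: Uint8Array = target !== undefined ? target : new Uint8Array(16);
  let length = 0;
  let reallocations = 0;

  for (const chunk of chunks) {
    const needed = length + chunk.length;
    if (needed > output.length) {
      if (target !== undefined) {
        // This sketch refuses to overflow a caller-supplied target.
        throw new RangeError("output target is too small");
      }
      let capacity = output.length;
      while (capacity < needed) capacity *= 2;
      const grown = new Uint8Array(capacity);
      grown.set(output.subarray(0, length)); // copy what was written so far
      output = grown;
      reallocations++;
    }
    output.set(chunk, length);
    length = needed;
  }
  return { output: output.subarray(0, length), reallocations };
}

// Usage: 100 chunks of 1000 bytes each, with and without a preallocated target.
const demoChunks: Uint8Array[] = [];
for (let i = 0; i < 100; i++) demoChunks.push(new Uint8Array(1000));

const growing = writeChunks(demoChunks);
const preallocated = writeChunks(demoChunks, new Uint8Array(100 * 1000));
console.log("reallocations without target:", growing.reallocations);
console.log("reallocations with target:", preallocated.reallocations);
```

Every reallocation also copies all bytes written so far, which is the work a preallocated target avoids.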

101arrowz commented 3 years ago

Actually, unzlibSync ignores the CRC. Could you send the file with bad performance so I can investigate? You can also do the same with fflate: pass the output buffer as an argument and performance improves.

photopea commented 3 years ago

It is about 5% faster when I add the output as a second argument. I was not able to find any description of unzlibSync. Do you have a manual somewhere?

It happened for a large PSD file, which contains ZLIB streams inside. A raster image (a photo) is being compressed, and the output is about 2x smaller than the input. I expect fflate to be slower in other similar cases too.

101arrowz commented 3 years ago

There are a few files for which fflate decompresses slower than UZIP, but usually only by 3-5%, and I've never found it to decompress slower than Pako. I can debug performance issues if you provide the Zlib stream as a binary file; otherwise I probably can't do much.

The actual cause of the performance difference is probably bounds checking: UZIP will freeze on an invalid stream, whereas Pako and fflate throw an error. With bounds checking removed, fflate becomes faster than UZIP for the files it is usually slower on.

The fact that Pako is faster than fflate is a bit concerning. That has literally never happened for me, so I would like to reproduce it locally if possible.

photopea commented 3 years ago

OK, here it is: data.txt

photopea commented 3 years ago

This is the input for pako.inflateRaw and UZIP.inflateRaw.

101arrowz commented 3 years ago

On my computer, fflate.inflateSync decompresses that file in ~270ms average, UZIP.inflateRaw decompresses in ~260ms average, pako.inflateRaw decompresses in ~300ms average. If fflate is slower than Pako on your machine for this file, I'm interested to know what CPU and platform you're using. I'm running on Node.js in WSL with an i7-8650U laptop.

photopea commented 3 years ago

Can you run it 20 times in a row? Maybe a browser optimizes the code, if it is executed a lot.

101arrowz commented 3 years ago

I took the average of 30 runs for those measurements. The cold start time is the main difference maker for the averages: Pako has pretty poor performance before JIT optimization (~450ms for the first few iterations) but becomes around 270ms after. fflate starts at ~350ms and goes to 250ms, UZIP starts at 300ms and goes to 250ms.
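The cold-start effect described above can be observed by recording each run separately instead of only the average. The following is a minimal sketch; `timeRuns` and `TimingReport` are hypothetical names, and `Date.now()` is used as the clock, which only has millisecond resolution.

```typescript
// Times `runs` consecutive calls of `task` and keeps every per-run duration,
// so the cold first run can be compared with later, JIT-optimized runs.

interface TimingReport {
  perRunMs: number[]; // duration of each run, in call order
  firstMs: number;    // the cold run
  lastMs: number;     // a run after warm-up
  totalMs: number;    // sum of all runs
  meanMs: number;     // totalMs divided by the number of runs
}

function timeRuns(task: () => void, runs: number): TimingReport {
  if (!Number.isInteger(runs) || runs < 1) {
    throw new RangeError("runs must be a positive integer");
  }
  const perRunMs: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    task();
    perRunMs.push(Date.now() - start);
  }
  const totalMs = perRunMs.reduce((sum, ms) => sum + ms, 0);
  return {
    perRunMs,
    firstMs: perRunMs[0],
    lastMs: perRunMs[runs - 1],
    totalMs,
    meanMs: totalMs / runs,
  };
}

// Usage with a stand-in workload. To benchmark a decompressor, the task could
// be something like `() => fflate.inflateSync(data)`, where `data` is assumed
// to be a Uint8Array holding the compressed stream.
const report = timeRuns(() => {
  let acc = 0;
  for (let i = 0; i < 1_000_000; i++) acc += i % 7;
  if (acc < 0) throw new Error("unreachable");
}, 20);
console.log("first:", report.firstMs, "ms; last:", report.lastMs, "ms");
console.log("total:", report.totalMs, "ms; mean:", report.meanMs, "ms");
```

Reporting the first run, the last run, and the total side by side makes it clear whether a difference between libraries comes from warm-up or from steady-state speed.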

photopea commented 3 years ago

I mean measuring a total time for 20 runs. That is what my program does. It performs a decompression of about 20 parts (20x7 MB), I want to make the total time as low as possible.

101arrowz commented 3 years ago

Well, to get the total time you can multiply the average by the number of runs. So for Pako it's about 6000ms, for fflate 5400ms, and for UZIP 5200ms.

photopea commented 3 years ago

The first run takes longer than the 20th run. After several runs, subsequent runs take the same time. So it is not clear what you mean by "average". In my case, I was testing the total time for the first 20 runs.

101arrowz commented 3 years ago

Average is the mean, i.e. the total time for all runs divided by the number of runs. So multiplying the mean by the number of runs gives the total time for all runs.
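As a worked example of that arithmetic, the sketch below multiplies the approximate means reported earlier in the thread (about 300ms for Pako, 270ms for fflate, 260ms for UZIP) by 20 runs. The helper `totalFromMean` is a hypothetical name. Note that those means were taken over 30 runs, so using them for the first 20 runs assumes the cold-start runs are weighted the same way in both cases.

```typescript
// Approximate mean decompression times quoted earlier in the thread.
const reportedMeans: Array<[string, number]> = [
  ["pako", 300],
  ["fflate", 270],
  ["UZIP", 260],
];

// Total time for `runCount` runs, assuming each run takes the mean time.
function totalFromMean(meanMs: number, runCount: number): number {
  return meanMs * runCount;
}

const runCount = 20;
for (const [name, meanMs] of reportedMeans) {
  console.log(`${name}: ${totalFromMean(meanMs, runCount)} ms for ${runCount} runs`);
}
// pako: 6000 ms for 20 runs
// fflate: 5400 ms for 20 runs
// UZIP: 5200 ms for 20 runs
```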

101arrowz commented 2 years ago

Closed because I'm not sure if there's anything actionable on this issue but let me know if you have further questions.