Closed photopea closed 2 years ago
Yes, the library is open-source. The source code is written in TypeScript, which is like JavaScript but with static typing checked at compile time. TypeScript compiles to plain JavaScript files, which are distributed via NPM but can also be downloaded without installing NPM if you wish.
If you want a global variable fflate to contain the library (like you've done in UZIP), you can download the file at https://cdn.jsdelivr.net/npm/fflate/umd/index.js. However, please note that the size benefits of this library can't be achieved this way, because that file contains the entire codebase, whereas if you use NPM, you can include only the parts of the library that you need. Basically, this reduces the size from 30kB minified/11kB gzipped to 12kB minified/6kB gzipped for compression + decompression of DEFLATE, ZLIB, and ZIP (like in UZIP). If you want this smaller size, let me know and I can provide you a custom bundle or give you instructions on how to make one.
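A minimal sketch of the UMD usage described above; the `unzipSync` call is just one example export, and `zipData` is a `Uint8Array` you would supply yourself:

```html
<!-- UMD build: defines a global `fflate` object, whole library included -->
<script src="https://cdn.jsdelivr.net/npm/fflate/umd/index.js"></script>
<script>
  // Usage mirrors UZIP: call methods on the global object
  var files = fflate.unzipSync(zipData); // zipData: a Uint8Array you provide
</script>
```

With NPM and a bundler, `import { unzipSync } from 'fflate';` instead lets tree-shaking drop the parts you don't use, which is where the smaller bundle size comes from.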
@photopea Did that link work for you?
I tried to inflate a 192 MB file.
fflate.unzlibSync took 6075 ms
pako.inflateRaw took 5563 ms
UZIP.inflateRaw took 4379 ms
Is there a method in "fflate" similar to "inflateRaw", which does not check the CRC, so that it could be faster? Also, could you provide the same interface as pako and UZIP, so that we can just replace "pako" with "fflate" in the code to switch to your library?
UZIP is faster probably because it accepts the output target array as an input, and it trusts you that the result will fit into it. No size checks or reallocations are needed during the decoding.
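The reallocation cost described above can be sketched in plain JavaScript. These functions are illustrative stand-ins, not UZIP's or fflate's actual internals:

```javascript
// Illustrative stand-ins showing why a caller-supplied output buffer helps:
// the decoder never has to grow its output, so there are no size checks or
// reallocation copies mid-stream.
function writeGrowing(chunks) {
  let out = new Uint8Array(1024);
  let len = 0;
  for (const c of chunks) {
    if (len + c.length > out.length) {
      // growth path: allocate a bigger buffer and copy everything over
      const bigger = new Uint8Array(Math.max(out.length * 2, len + c.length));
      bigger.set(out);
      out = bigger;
    }
    out.set(c, len);
    len += c.length;
  }
  return out.subarray(0, len);
}

function writePreallocated(chunks, out) {
  // trusts the caller that `out` is big enough: no checks, no reallocations
  let len = 0;
  for (const c of chunks) {
    out.set(c, len);
    len += c.length;
  }
  return out.subarray(0, len);
}

const chunks = Array.from({ length: 4 }, () => new Uint8Array(2048).fill(7));
const grown = writeGrowing(chunks);
const pre = writePreallocated(chunks, new Uint8Array(8192));
console.log(grown.length === pre.length); // true
```

Both paths produce the same bytes; the difference is purely the copy-on-growth work that the preallocated version skips.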
Actually unzlibSync ignores the CRC. Could you send the file with bad performance so I can investigate? Also, you can do the same with fflate: pass the output buffer as an argument, and performance improves.
It is about 5% faster when I add the output as a second argument. I was not able to find any description of unzlibSync. Do you have a manual somewhere?
It happened for a large PSD file, which contains ZLIB streams inside. A raster image (a photo) is being compressed; the output is about 2x smaller than the input. I think fflate would be slower in other similar cases, too.
There are a few files for which fflate decompresses slower than UZIP, but usually only by 3-5%, and I've never seen it decompress slower than pako. I can debug performance issues if you provide the ZLIB stream as a binary file; otherwise I probably can't do much.
The actual cause of the performance difference is probably bounds checking: UZIP will freeze on an invalid stream, whereas Pako and fflate throw an error. By removing bounds checking, fflate becomes faster than UZIP for the files it is usually slower on.
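The trade-off above can be sketched in plain JavaScript; these loops are illustrative stand-ins, not the libraries' real decoders. Typed-array reads past the end return `undefined` instead of throwing, so a decoder without bounds checks can silently misbehave (or spin forever waiting for data that never arrives) on a truncated stream, while an explicit check costs one comparison per read but fails fast:

```javascript
// Stand-in for an unchecked decoder read loop: out-of-bounds typed-array
// reads yield undefined, so the sum silently becomes NaN with no error.
function readAllUnchecked(buf, count) {
  let sum = 0;
  for (let i = 0; i < count; i++) sum += buf[i];
  return sum;
}

// Stand-in for a bounds-checked read loop: one extra comparison per read,
// but a truncated stream produces a clear error instead of garbage.
function readAllChecked(buf, count) {
  let sum = 0;
  for (let i = 0; i < count; i++) {
    if (i >= buf.length) throw new RangeError('truncated stream');
    sum += buf[i];
  }
  return sum;
}

const buf = new Uint8Array([1, 2, 3]);
console.log(readAllUnchecked(buf, 5)); // NaN: corruption with no error raised
console.log(readAllChecked(buf, 3));   // 6
```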
The fact that Pako is faster than fflate is a bit concerning; that has literally never happened for me, so I would like to reproduce it locally if possible.
This is the input for pako.inflateRaw and UZIP.inflateRaw.
On my computer, fflate.inflateSync decompresses that file in ~270ms on average, UZIP.inflateRaw in ~260ms, and pako.inflateRaw in ~300ms. If fflate is slower than Pako on your machine for this file, I'm interested to know what CPU and platform you're using. I'm running on Node.js in WSL with an i7-8650U laptop.
Can you run it 20 times in a row? Maybe a browser optimizes the code, if it is executed a lot.
I took the average of 30 runs for those measurements. The cold-start time is the main difference maker for the averages: Pako has pretty poor performance before JIT optimization (~450ms for the first few iterations) but settles around 270ms after; fflate starts at ~350ms and goes down to ~250ms; UZIP starts at ~300ms and goes to ~250ms.
I mean measuring the total time for 20 runs; that is what my program does. It performs decompression of about 20 parts (20 × 7 MB), and I want to make the total time as low as possible.
Well, to get the total time you can multiply the average by the number of runs. So for Pako it's about 6000ms, fflate is 5400ms, and UZIP is 5200ms.
The first run takes longer than the 20th run; after several runs, subsequent runs take identical time. So it is not clear what you mean by "average". In my case, I was measuring the total time for the first 20 runs.
Average is the mean, i.e. the total time for all runs divided by the number of runs. So multiplying the mean by the number of runs gives the total time for all runs.
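The timing methodology discussed above can be captured in a small harness; this is a sketch for Node.js, where `work` stands in for a decompression call such as `fflate.unzlibSync(data)`:

```javascript
// Minimal Node.js timing harness: run `work` many times, record each run's
// elapsed time, and report both the total and the mean.
function bench(work, runs) {
  const times = [];
  for (let i = 0; i < runs; i++) {
    const t0 = process.hrtime.bigint();
    work();
    times.push(Number(process.hrtime.bigint() - t0) / 1e6); // elapsed ms
  }
  const total = times.reduce((a, b) => a + b, 0);
  // mean * runs === total, so comparing means also compares totals
  return { times, total, mean: total / times.length };
}

// Cheap stand-in workload; the first iterations include JIT warm-up cost
const result = bench(() => {
  let s = 0;
  for (let i = 0; i < 1e5; i++) s += i;
  return s;
}, 20);
console.log(`mean ${result.mean.toFixed(3)} ms, total ${result.total.toFixed(3)} ms`);
```

Since the mean includes the slow cold-start runs, comparing means over the same run count is the same comparison as comparing totals for the first N runs.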
Closed because I'm not sure if there's anything actionable on this issue but let me know if you have further questions.
Hi, could you provide a JS file of this library? Is this library open-source? I do not use NPM.