ronag closed 7 years ago
Hi,
lz-string is based on LZW, a dictionary coder from the LZ78 family, and possibly follows that with an entropy coder like Huffman or arithmetic coding (I'm not sure). lz-utf8, however, only applies a dictionary coder equivalent to LZ77. One of the characteristics of LZ77 (compared to LZ78 + entropy coding) is that its compression ratio is lower but its decompression is extremely fast. Overall, the library and the choice of algorithm are optimized for speed, not size (and for creating a binary format that can naturally accept plain UTF-8 strings).
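To illustrate why LZ77-style decompression is cheap, here is a minimal sketch of a sliding-window coder. It is a toy written for this explanation, not lz-utf8's actual algorithm or binary format; the function names, token layout, and window parameters are all assumptions made for the example.

```python
def lz77_compress(data: bytes, window: int = 4096, min_match: int = 3, max_match: int = 255):
    """Greedy LZ77-style tokenizer (hypothetical layout).

    Emits either a literal byte (int) or a (distance, length) back-reference.
    """
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        start = max(0, i - window)
        # Brute-force search of the sliding window; real coders use hash tables.
        for j in range(start, i):
            length = 0
            while (length < max_match and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append((best_dist, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


def lz77_decompress(tokens) -> bytes:
    """Decoding needs no search: it only appends literals or copies earlier output."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            # Copy one byte at a time so overlapping matches (dist < length) work.
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)
    return bytes(out)


sample = b"abcabcabcabc"
tokens = lz77_compress(sample)
restored = lz77_decompress(tokens)
```

All the searching happens on the compression side. The decoder is a plain copy loop, which is where the speed comes from, and there is no entropy stage to shrink the tokens further, which is where the size penalty comes from.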
For very short inputs, lz-utf8 may not be as effective as lz-string with regard to size. Try comparing with longer strings.
The timing difference, which varies between browsers, is a bit misleading. The reason this short example appears to run faster in lz-string, I believe, is probably initialization time (creation of data structures, lookup tables, etc.): since the string is so short, the compression time itself may be negligible. It may also be that the resolution of the timer I'm using in the demo is higher.
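The effect of one-time setup and timer resolution on a very short input can be sketched as follows. This is only an illustration: `zlib.compress` from the Python standard library stands in for the compressor, since neither demo's timing code is shown here, and `time_call` is a hypothetical helper.

```python
import time
import zlib


def time_call(fn, payload, repeats=1):
    """Return seconds per call, averaged over `repeats` back-to-back calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(payload)
    return (time.perf_counter() - start) / repeats


short_input = b"hello world " * 4

# A single call includes any one-time setup and is close to the timer's resolution.
first_call = time_call(zlib.compress, short_input)

# Repeating the call amortizes setup cost and timer granularity.
steady = time_call(zlib.compress, short_input, repeats=10_000)

print(f"first call:   {first_call * 1e6:.1f} us")
print(f"steady state: {steady * 1e6:.1f} us per call")
```

A single measurement on a tiny string mostly reflects fixed overhead, so it says little about how the algorithms compare on real workloads.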
Anyway, with longer strings, like the first 1MB of the English Bible, I get:
lz-utf8 compression: 978453 to 367357 bytes (37.5%) in 149.5ms (6.5MB/s)
lz-utf8 decompression: 9.7ms (100.7MB/s)
lz-string compression: 978453 to 163603 (8%) in 703ms
lz-string decompression: 78ms
So here lz-utf8 is about 4.7 times faster during compression and 8 times faster during decompression, though the compression efficiency is significantly lower.
Results would also vary between inputs, of course.
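For anyone who wants to check the arithmetic, the derived figures can be recomputed from the raw numbers quoted above. This is a small sketch, assuming the MB/s values use decimal megabytes (1,000,000 bytes). The lz-string size ratio is left out because the unit of its reported output size is not stated.

```python
MB = 1_000_000  # assumption: decimal megabytes

input_bytes = 978_453
lzutf8 = {"out_bytes": 367_357, "compress_ms": 149.5, "decompress_ms": 9.7}
lzstring = {"compress_ms": 703.0, "decompress_ms": 78.0}


def throughput_mb_s(n_bytes, ms):
    """Throughput in MB/s for n_bytes processed in ms milliseconds."""
    return n_bytes / MB / (ms / 1000)


ratio = lzutf8["out_bytes"] / input_bytes
compress_rate = throughput_mb_s(input_bytes, lzutf8["compress_ms"])
decompress_rate = throughput_mb_s(input_bytes, lzutf8["decompress_ms"])
compress_speedup = lzstring["compress_ms"] / lzutf8["compress_ms"]
decompress_speedup = lzstring["decompress_ms"] / lzutf8["decompress_ms"]

print(f"lz-utf8 size ratio:       {ratio:.1%}")
print(f"lz-utf8 compression:      {compress_rate:.1f} MB/s")
print(f"lz-utf8 decompression:    {decompress_rate:.1f} MB/s")
print(f"compression speedup:      {compress_speedup:.1f}x")
print(f"decompression speedup:    {decompress_speedup:.1f}x")
```

Small differences from the quoted throughput are expected, since the timings above are already rounded.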
In any case, since the binary format is frozen, the low compression efficiency cannot be improved. In general, it was a conscious and intentional decision to strongly optimize for speed. Any change to the choice of algorithm would, in practice, result in the creation of a completely different format and library. I don't really have any plans for that at the moment.
Just did a comparison between lzutf8 and lz-string using:
https://rotemdan.github.io/lzutf8/demo/
http://pieroxy.net/blog/pages/lz-string/demo.html
Using the input
I get: