pieroxy / lz-string

LZ-based compression algorithm for JavaScript
MIT License
4.13k stars 569 forks

How safe is lz-string? #130

Open caracal7 opened 5 years ago

caracal7 commented 5 years ago

If I use something like this server-side:

let obj = lzString.decompressFromUint8Array(new Uint8Array(userData));

can it break on random or broken data?

JobLeonard commented 5 years ago

You need to be more clear about "safe" and "breaking" here, as well as which version you are using. But TL;DR: no, not that we know of, unless you count breaking due to running out of memory on large objects. That is not exactly something that LZString can be held responsible for though.

For example, the 1.34 version throws every substring into one large dictionary. So if you throw a large enough string at it (I'm talking millions of characters), it will start to stutter simply because JS objects aren't made to hold millions of keys.

The 2.0RC version builds a trie of small objects instead. You still risk running out of memory, but each individual node holds at most 65k chars (because that is the number of char codes we can get out of UTF16), which is much smaller, so it shouldn't have this issue.
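
To make the difference between the two layouts concrete, here is a rough sketch. It is an illustration only and not copied from either LZString version: `buildFlatDictionary` and `buildTrie` are hypothetical names, and using a `Map` per trie node is my own choice for the example.

```javascript
// Illustration only: neither structure is taken from LZString's source.

// Flat dictionary: a single object with one key per phrase seen so far.
function buildFlatDictionary(phrases) {
  const dict = Object.create(null);
  let nextCode = 0;
  for (const p of phrases) {
    if (!(p in dict)) dict[p] = nextCode++;
  }
  return dict;
}

// Trie: every node is a small map keyed by a single UTF-16 char code,
// so no individual node can hold more than 65,536 children.
function buildTrie(phrases) {
  const root = { code: -1, children: new Map() };
  let nextCode = 0;
  for (const p of phrases) {
    let node = root;
    for (let i = 0; i < p.length; i++) {
      const c = p.charCodeAt(i);
      if (!node.children.has(c)) {
        node.children.set(c, { code: -1, children: new Map() });
      }
      node = node.children.get(c);
    }
    if (node.code === -1) node.code = nextCode++;
  }
  return root;
}

const phrases = ['a', 'ab', 'abc', 'b', 'bc'];
const flat = buildFlatDictionary(phrases);
const trie = buildTrie(phrases);
console.log(Object.keys(flat).length); // 5 keys, all in one object
console.log(trie.children.size);       // 2 children at the root ('a' and 'b')
```

The total amount of data is similar in both cases, but the trie spreads it over many small nodes instead of one object with a huge number of keys.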

If you use the 2.0 unsafe version, well, it says unsafe for a reason. It is trivial to build a string that freezes that one (not crash, just freeze), namely one that iterates over all 65k charcodes in sequence.

caracal7 commented 5 years ago

I'm interested in what happens if I try to decompress a random Uint8Array. Can random data break lzString.decompressFromUint8Array execution?

JobLeonard commented 5 years ago

It will most likely return an empty string as it bails out, and in the unlikely worst case give you a garbage string instead. Hypothetically, null can also be returned if you try to decompress an empty array.

decompressFromUint8Array just treats pairs of 8-bit values from the Uint8Array as 16-bit UTF16 char codes, converts the array to a string that way, and then calls the regular decompression algorithm after that:

  //decompress from uint8array (UCS-2 big endian format)
  decompressFromUint8Array:function (compressed) {
    if (compressed===null || compressed===undefined){
        return LZString.decompress(compressed);
    } else {
        var buf=new Array(compressed.length/2); // 2 bytes per character
        for (var i=0, TotalLen=buf.length; i<TotalLen; i++) {
          buf[i]=compressed[i*2]*256+compressed[i*2+1];
        }

        var result = [];
        buf.forEach(function (c) {
          result.push(f(c)); // f is an alias for String.fromCharCode
        });
        return LZString.decompress(result.join(''));

    }
  },

  decompress: function (compressed) {
    if (compressed == null) return "";
    if (compressed == "") return null;
    return LZString._decompress(compressed.length, 32768, function(index) { return compressed.charCodeAt(index); });
  },
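
The byte-pairing step can also be tried in isolation, without LZString loaded. This is a minimal sketch that mirrors the loop above; `bytesToUcs2String` is a hypothetical helper name, and ignoring an odd trailing byte via `Math.floor` is an assumption of the sketch, not what the quoted code does.

```javascript
// Hypothetical helper, not part of LZString: combines big-endian byte pairs
// into 16-bit char codes, mirroring the loop in decompressFromUint8Array.
function bytesToUcs2String(bytes) {
  // Assumption of this sketch: an odd trailing byte is simply ignored.
  const charCount = Math.floor(bytes.length / 2);
  const chars = new Array(charCount);
  for (let i = 0; i < charCount; i++) {
    const highByte = bytes[i * 2];
    const lowByte = bytes[i * 2 + 1];
    chars[i] = String.fromCharCode(highByte * 256 + lowByte);
  }
  return chars.join('');
}

const paired = bytesToUcs2String(new Uint8Array([0x00, 0x41, 0x26, 0x3a]));
console.log(paired.length);        // 2 chars from 4 bytes
console.log(paired.charCodeAt(0)); // 65, i.e. 0x0041 ("A")
console.log(paired.charCodeAt(1)); // 9786, i.e. 0x263A
```

Whatever string comes out of this step is handed to the regular decompress function, so random bytes simply become a random string.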

Try it for yourself in your console:

randomString = (() => {let s = ''; for(let i = 0; i < 100; i++) s += String.fromCharCode(Math.random()*0x10000|0); return s})();
LZString.decompress(randomString); // Assumes you loaded LZString

This will almost always return "".
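
For the server-side case from the original question, one option is a small defensive wrapper around the call. The following is only a sketch under some assumptions: `safeDecompress` and `maxBytes` are hypothetical names, the 1 MiB cap is arbitrary, and treating an empty result as a failure is a choice that only makes sense if you never expect an empty payload. Going by the snippet quoted above, an odd-length array would make `new Array(compressed.length/2)` throw a RangeError (JavaScript rejects non-integer array lengths), so the sketch checks for that too.

```javascript
// Hypothetical wrapper; `decompressFn` stands in for
// LZString.decompressFromUint8Array so the sketch runs without the library.
function safeDecompress(decompressFn, bytes, maxBytes = 1 << 20) {
  if (!(bytes instanceof Uint8Array)) return null;
  // The decompressor consumes bytes in pairs; empty input yields null anyway.
  if (bytes.length === 0 || bytes.length % 2 !== 0) return null;
  // Assumption: cap the input size to limit memory use (1 MiB is arbitrary).
  if (bytes.length > maxBytes) return null;
  let result;
  try {
    result = decompressFn(bytes);
  } catch (err) {
    return null; // treat any exception as "not valid compressed data"
  }
  // Assumption: "" and null both mean that decompression bailed out.
  return typeof result === 'string' && result.length > 0 ? result : null;
}

// Usage, assuming LZString is loaded and userData holds the raw bytes:
// const text = safeDecompress(LZString.decompressFromUint8Array, new Uint8Array(userData));
```

This does not detect the garbage-string case, so the result should still be validated afterwards (for example by parsing it and checking its shape) before it is trusted.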