MiloszKrajewski / K4os.Compression.LZ4

LZ4/LZ4HC compression for .NET Standard 1.6/2.0 (formerly known as lz4net)
MIT License

can't decode #59

Closed iml6yu closed 3 years ago

iml6yu commented 3 years ago

Description: A JSON-formatted string can't be decoded when I set the byte[] length myself.

To reproduce: Steps to reproduce the behavior:

            var s = "{\"ID\":\"59fd4cb7-dd79-4f9d-953a-ac91dc0b00f9\",\"Key\":\"init\",\"Step\":2147483647,\"Datas\":null}";
            var srcList = new List<byte>();
            srcList.AddRange(System.Text.Encoding.UTF8.GetBytes(s));
            var src = srcList.ToArray();
            var target = new byte[LZ4Codec.MaximumOutputSize(src.Length) ];

            LZ4Codec.Encode(src, 0, src.Length, target, 0, target.Length, LZ4Level.L03_HC);

            var rarray = new byte[target.Length];
            Array.Copy(target, rarray, rarray.Length);

            var r = new byte[src.Length];
            var rbuffer = LZ4Codec.Decode(rarray, r);

            var rs = System.Text.Encoding.UTF8.GetString(r);
            Assert.IsTrue(rs == s);

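Expected behavior: After decompression, the result should be equal to s.

Actual behavior: The decompressed result is an empty string.

Additional context: The code above passes when compressing and decompressing non-JSON strings, but it fails for JSON-formatted strings; after decompression, rs is string.Empty.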

iml6yu commented 3 years ago

One more note: when I use the LZ4Pickler class, everything works fine, but it is slightly slower, and I need more speed.

MiloszKrajewski commented 3 years ago

Unfortunately I don't read Chinese, but pickler is the simplest approach:

var s = "{\"ID\":\"59fd4cb7-dd79-4f9d-953a-ac91dc0b00f9\",\"Key\":\"init\",\"Step\":2147483647,\"Datas\":null}";
var e = LZ4Pickler.Pickle(Encoding.UTF8.GetBytes(s));
var t = Encoding.UTF8.GetString(LZ4Pickler.Unpickle(e));

Debug.Assert(s == t);

If you really need to use low level functions, you need to do all the work around it. For example, I can see that you are not using the value returned by LZ4Codec.Encode(...). That value tells you how many bytes the compressed data actually occupies, which is the length you need to pass when calling LZ4Codec.Decode(...).

BTW, using GetBytes, GetString, new byte[] puts strain on GC. Try using IBufferWriter<byte> as much as possible (Pickler works with it).

iml6yu commented 3 years ago

Thank you. My English is very poor, so I can't describe my problem clearly.

When using the LZ4Codec class to compress and decompress JSON strings, if the length of the decompression output byte[] is set to the length of the source byte[], the data is not decompressed. If the length is set to Source.Length * 255, it is decompressed correctly.

e.g.

//this is source string
            var s = "{\"ID\":\"59fd4cb7-dd79-4f9d-953a-ac91dc0b00f9\",\"Key\":\"init\",\"Step\":2147483647,\"Datas\":null}";
            var srcList = new List<byte>();
            srcList.AddRange(System.Text.Encoding.UTF8.GetBytes(s));

//Get src byte[]
            var src = srcList.ToArray();
//Init encode byte[]
            var target = new byte[LZ4Codec.MaximumOutputSize(src.Length) ];

            LZ4Codec.Encode(src, 0, src.Length, target, 0, target.Length, LZ4Level.L03_HC);

//Data copy  
            var rarray = new byte[target.Length];
            Array.Copy(target, rarray, rarray.Length);

//init  decode byte[] 
            var r = new byte[src.Length];
            LZ4Codec.Decode(rarray, r);
// result string 
            var rs = System.Text.Encoding.UTF8.GetString(r);
//this is false rs != s,  
// But, if  s="abcdefg",this is true, rs == s
            Assert.IsTrue(rs == s);

iml6yu commented 3 years ago

Pickler is very good, it's great.

MiloszKrajewski commented 3 years ago

Your rarray holds the maximum possible number of bytes, not the actual number of compressed bytes. Take actualLength = LZ4Codec.Encode(...); and use it when creating var rarray = new byte[actualLength];
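
A minimal sketch of that fix, assuming the byte[] overloads of LZ4Codec.Encode and LZ4Codec.Decode that take offsets and lengths. The class name Lz4RoundTrip and the method RoundTrip are hypothetical helpers introduced only for illustration. A raw LZ4 block does not store the original length, so this sketch keeps src.Length around to size the output buffer.

```csharp
using System;
using System.Text;
using K4os.Compression.LZ4;

// Hypothetical helper, not part of the library.
public static class Lz4RoundTrip
{
    public static string RoundTrip(string text)
    {
        byte[] src = Encoding.UTF8.GetBytes(text);

        // Worst-case buffer; the compressed data usually fills only part of it.
        byte[] target = new byte[LZ4Codec.MaximumOutputSize(src.Length)];

        // Encode returns the number of bytes actually written to target.
        int encodedLength = LZ4Codec.Encode(
            src, 0, src.Length,
            target, 0, target.Length,
            LZ4Level.L03_HC);
        if (encodedLength < 0)
            throw new InvalidOperationException("Target buffer too small for encoding.");

        // Decode only the bytes that were written, not the whole buffer.
        byte[] decoded = new byte[src.Length];
        int decodedLength = LZ4Codec.Decode(
            target, 0, encodedLength,
            decoded, 0, decoded.Length);
        if (decodedLength != src.Length)
            throw new InvalidOperationException("Decoding failed or length mismatch.");

        return Encoding.UTF8.GetString(decoded, 0, decodedLength);
    }
}
```

Passing encodedLength to Decode avoids the extra rarray copy entirely; the trailing bytes of target are simply never read.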

BUT...

your code is far from optimal. You are making many unnecessary allocations (for example: srcList.AddRange(System.Text.Encoding.UTF8.GetBytes(s)); var src = srcList.ToArray();). Stick to my previous example:

var s = "{\"ID\":\"59fd4cb7-dd79-4f9d-953a-ac91dc0b00f9\",\"Key\":\"init\",\"Step\":2147483647,\"Datas\":null}";
var e = LZ4Pickler.Pickle(Encoding.UTF8.GetBytes(s));
var t = Encoding.UTF8.GetString(LZ4Pickler.Unpickle(e));