stangelandcl / LZ4Sharp

Unmaintained port of LZ4 Compression algorithm to C#

Data loss during text (de?)compression (output smaller than input) #6

Closed by sblive 12 years ago

sblive commented 12 years ago

Great port! I found a big problem, though: I tried many different JSON strings (~300MB) and found at least one where the last block seems to be missing after decompression (the output is much shorter than the input, ~300 vs. 350k), so it is unusable. I tried the latest version from ~5h ago using the Debug build (i.e. the non-generated code). I cannot test the generated version, but I did test the "multithreaded win32" version from the original LZ4 homepage and it works, so the problem is in the port.

input: http://snipurl.com/24vzsc1 output http://snipurl.com/24vzu01

Please help!

Thanks!

P.S.: Some benchmarks on my machine (all JSON, Vista x64, Sandy Bridge @ 4.6 GHz):

Name;%;In;Out;C MB/s;D MB/s;
GZIP ModelleStammdaten;10.82%;292950;31708;39.35;69.84;
LZF ModelleStammdaten;17.66%;292950;51749;93.13;99.78;
LZ4 ModelleStammdaten;15.84%;292950;46402;253.98;155.21;
QLZ1 ModelleStammdaten;13.42%;292950;39315;103.47;84.66;
QLZ3 ModelleStammdaten;11.30%;292950;33094;14.78;103.47;
GZIP ModelleETInfo;9.44%;240525;22700;53.34;120.73;
LZF ModelleETInfo;15.01%;240525;36098;99.73;134.93;
LZ4 ModelleETInfo;14.72%;240525;35401;286.73;286.73;
QLZ1 ModelleETInfo;12.29%;240525;29558;109.23;127.43;
QLZ3 ModelleETInfo;10.20%;240525;24545;14.61;163.84;
GZIP Artikelersetzung;12.59%;2303948;290043;60.53;118.77;
LZF Artikelersetzung;20.64%;2303948;475465;88.96;100.79;
LZ4 Artikelersetzung;18.20%;2303948;419282;278.13;296.92;
QLZ1 Artikelersetzung;14.63%;2303948;337132;122.75;130.01;
QLZ3 Artikelersetzung;14.41%;2303948;332008;11.67;104.63;
GZIP ModelleFahrzeug;8.48%;20268197;1718578;69.83;121.41;
LZF ModelleFahrzeug;15.03%;20268197;3046920;107.21;140.99;
LZ4 ModelleFahrzeug;12.35%;20268197;2502983;274.95;315.84;
QLZ1 ModelleFahrzeug;9.87%;20268197;2001167;129.55;145.01;
QLZ3 ModelleFahrzeug;8.52%;20268197;1726858;14.61;164.64;
GZIP Artikel;7.38%;123633224;9128337;75.54;130.96;
LZF Artikel;12.65%;123633224;15634502;106.13;146.03;
LZ4 Artikel;8.74%;123633224;10806478;296.62;357.29;
QLZ1 Artikel;7.47%;123633224;9240482;133.97;149.27;
QLZ3 Artikel;6.90%;123633224;8536067;15.83;147.73;

sblive commented 12 years ago

Code used:

byte[] compressLZ4(string s)
{
    return LZ4CompressorFactory.CreateNew().Compress(Encoding.UTF8.GetBytes(s));
}

string decompressLZ4(byte[] s)
{
    return Encoding.UTF8.GetString(LZ4DecompressorFactory.CreateNew().Decompress(s));
}
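For anyone trying to reproduce this, a round-trip check along the following lines might help narrow down which inputs get truncated. This is only a minimal sketch: the `RoundTripCheck` class and its `RoundTrips` method are hypothetical names that are not part of LZ4Sharp, and the compressor and decompressor are passed in as delegates so the snippet compiles without the library.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of LZ4Sharp.
static class RoundTripCheck
{
    // Returns true when decompress(compress(bytes)) reproduces the UTF-8 bytes of text exactly.
    public static bool RoundTrips(
        string text,
        Func<byte[], byte[]> compress,
        Func<byte[], byte[]> decompress)
    {
        byte[] original = Encoding.UTF8.GetBytes(text);
        byte[] restored = decompress(compress(original));

        // A missing last block shows up here as a shorter output.
        if (restored.Length != original.Length)
        {
            Console.WriteLine("Length mismatch: in={0}, out={1}", original.Length, restored.Length);
            return false;
        }

        // Same length: look for the first corrupted byte.
        for (int i = 0; i < original.Length; i++)
        {
            if (original[i] != restored[i])
            {
                Console.WriteLine("First differing byte at offset {0}", i);
                return false;
            }
        }
        return true;
    }
}
```

With the calls from the code above, it could be invoked as `RoundTripCheck.RoundTrips(json, b => LZ4CompressorFactory.CreateNew().Compress(b), b => LZ4DecompressorFactory.CreateNew().Decompress(b))`, where `json` is the string under test.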

stangelandcl commented 12 years ago

See the comment on closed issue number 3 for an explanation. I think this is fixed. I couldn't get your test data to fail, but the other bug reporter sent some data too, and I could reproduce the problem with that. So try the new version and see.

stangelandcl commented 12 years ago

Found another location where the same type of bug could happen and fixed that too.