xiaolu / lz4

Automatically exported from code.google.com/p/lz4

LZ4_uncompress output differs on big data? #23

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Extract (decompress) the compressed data.

What is the expected output? What do you see instead?
The LZ4_uncompress output differs.

What version of the product are you using? On what operating system?
Linux Mint Maya x64, kernel 3.4.4

Please provide any additional information below.
lz4 rev68, compressed with lz4hc; some bytes differ when decompressing.
Compressed size is 638 MB, uncompressed size is around 1.8 GB.

Difference 1

extracted_file1:
03EE DCE0: 32 42 39 44 41 38 44 38  39 45 44 39 30 39 34 35  2B9DA8D8 9ED90945 
extracted_file2: 
03EE DCE0: 32 42 31 44 41 38 44 38  39 45 44 39 30 39 34 35  2B1DA8D8 9ED90945  

Difference 2

extracted_file1:
3D36 DF20: CD C6 B2 FF CE C8 BB FF  D0 CA BE FF D2 CC C1 FF  ........ ........  
extracted_file2:
3D36 DF20: CD C6 BA FF CE C8 BB FF  D0 CA BE FF D2 CC C1 FF  ........ ........  

Difference 3

extracted_file1:
3EA7 6F20: A1 D7 17 BC 06 6A 9C 11  2C 4F 31 D6 16 64 74 B4  .....j.. ,O1..dt.  
extracted_file2:
3EA7 6F20: A1 D7 13 BC 06 6A 9C 11  2C 4F 31 D6 16 64 74 B4  .....j.. ,O1..dt.  

Difference 4

extracted_file1:
528F 9F60: 83 78 10 00 74 12 8B 45  FC 8B 40 18 8B 4D FC 8B  .x..t..E ..@..M..  
extracted_file2:
528F 9F60: 83 78 18 00 74 12 8B 45  FC 8B 40 18 8B 4D FC 8B  .x..t..E ..@..M..  

There are more differences, and they look random, but I have no idea why.

regards,
Mark

Original issue reported on code.google.com by stackexc...@gmail.com on 23 Jun 2012 at 2:17

GoogleCodeExporter commented 9 years ago
I have now noticed that if I memset the dest buffer to 0, this does not 
happen.
Is the memset required, or is there any way to skip it to get better speed?

Original comment by stackexc...@gmail.com on 23 Jun 2012 at 3:53

GoogleCodeExporter commented 9 years ago
Data corruption is always considered critical. I'll look into it.

Original comment by yann.col...@gmail.com on 23 Jun 2012 at 10:06

GoogleCodeExporter commented 9 years ago
In your examples, the corrupted byte is always the 3rd one.

Is that happening at this position in the file, or is it just the 3rd position 
on the sample line?

Another question: is the corruption happening only when compressing with 
LZ4HC, or does it also happen on the same file when compressing with LZ4?

Last point: is it possible to get access to the file, or the portion of the file 
where the problem happens? This would help to reproduce the issue.

Rgds

Original comment by yann.col...@gmail.com on 23 Jun 2012 at 10:11

GoogleCodeExporter commented 9 years ago
When I tried yesterday, the differing byte was always the third one from the 
offset of that line. (The offset is the hex value before the colon ":" on that line.)

But it looks like it is not always the third one: when I tried today, it also 
appeared at the 7th position of the line.

--- lz4hc compressed ---

15C5 99D0: 4D F6 C1 10 74 07 38 3C  00 00 C0 EB 46 F6 C1 84  M...t.8< ....F...  
15C5 99D0: 4D F6 C1 10 74 07 B8 3C  00 00 C0 EB 46 F6 C1 84  M...t..< ....F...  

1747 FAD0: 74 0D 8B 07 6A 01 8B CF  FF 10 EB 03 88 5D FF 8D  t...j... .....]..  
1747 FAD0: 74 0D 8B 07 6A 01 0B CF  FF 10 EB 03 88 5D FF 8D  t...j... .....]..  

3CBF 3D20: 07 50 03 C1 04 51 81 C2  A8 00 00 00 52 E8 6E C3  .P...Q.. ....R.n.  
3CBF 3D20: 07 50 83 C1 04 51 81 C2  A8 00 00 00 52 E8 6E C3  .P...Q.. ....R.n.  

412F 3C20: 00 00 8B C3 5B 59 5D C3  55 8B EC 81 C4 F0 FB FF  ....[Y]. U.......  
412F 3C20: 00 00 0B C3 5B 59 5D C3  55 8B EC 81 C4 F0 FB FF  ....[Y]. U.......  

4C48 5E60: 5F 5F 71 29 0A 20 20 20  20 7B 20 72 65 74 75 72  __q).     { retur  
4C48 5E60: 5F 5F 79 29 0A 20 20 20  20 7B 20 72 65 74 75 72  __y).     { retur  

4ED6 EEE0: 8D EB 31 FF 85 C0 0F 85  1C 02 00 00 FF 75 3C E8  ..1..... .....u<.  
4ED6 EEE0: 8D EB B1 FF 85 C0 0F 85  1C 02 00 00 FF 75 3C E8  ........ .....u<.  

--- lz4 compressed (1st try) ---

0841 58D0: EF 45 35 1C 6F 28 40 82  EF 00 ED E7 B1 7E 4B 83  .E5.o(@. .....~K.  
0841 58D0: EF 45 35 1C 6F 28 48 82  EF 00 ED E7 B1 7E 4B 83  .E5.o(H. .....~K.  

529F 9B10: E3 36 EF 36 0B 37 14 37  6B 37 ED 37 2D 38 8B 38  .6.6.7.7 k7.7-8.8  
529F 9B10: E3 36 EF 36 0B 37 10 37  6B 37 ED 37 2D 38 8B 38  .6.6.7.7 k7.7-8.8  

--- lz4 compressed (2nd try) ---

1447 99D0: 10 18 D7 82 C4 36 9B 5B  49 BB C1 9F 6F 23 68 28  .....6.[ I...o#h(  
1447 99D0: 10 18 D7 82 C4 36 13 5B  49 BB C1 9F 6F 23 68 28  .....6.[ I...o#h(  

28BF BB90: 08 81 09 84 00 0A 07 84  84 84 87 87 87 86 86 86  ........ ........  
28BF BB90: 08 81 09 84 00 0A 87 84  84 84 87 87 87 86 86 86  ........ ........  

41ED FE20: F9 F9 F2 FD FE FE FA FA  FD FA FA FA FA FA FA FA  ........ ........  
41ED FE20: F9 F9 FA FD FE FE FA FA  FD FA FA FA FA FA FA FA  ........ ........  

I had deleted the original file, so I extracted it with memset 0 and 
recompressed it with lz4, then ran the two lz4 tests above. It still happens 
when extracting without memset.
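For reference, offsets like the ones listed above can be collected with a small comparison script. This is only a minimal sketch in Python, not the tool used in this thread; the function name `diff_offsets` and the temporary file names are made up for illustration.

```python
import os
import tempfile


def diff_offsets(path_a, path_b, chunk_size=1 << 20):
    """Yield (offset, byte_a, byte_b) for every position where the files differ.

    Both files are read in fixed-size chunks so that multi-gigabyte outputs
    do not have to fit in memory. Only the common prefix is compared; a
    length mismatch should be checked separately.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a or not b:
                break
            if a != b:  # identical chunks are skipped without a byte loop
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        yield offset + i, x, y
            offset += min(len(a), len(b))


if __name__ == "__main__":
    # The two 16-byte lines from "extracted_file1" / "extracted_file2" in the
    # first report; they differ only in the third byte (39 vs 31).
    line_1 = bytes.fromhex("32423944413844383945443930393435")
    line_2 = bytes.fromhex("32423144413844383945443930393435")
    with tempfile.TemporaryDirectory() as tmp:
        p1 = os.path.join(tmp, "extracted_file1")  # hypothetical names
        p2 = os.path.join(tmp, "extracted_file2")
        with open(p1, "wb") as f:
            f.write(line_1)
        with open(p2, "wb") as f:
            f.write(line_2)
        for off, x, y in diff_offsets(p1, p2):
            print(f"{off:08X}: {x:02X} != {y:02X}")
```

Reporting the absolute offset of each differing byte, rather than its position within a 16-byte dump line, makes it easier to see whether the positions follow any pattern.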

The file is a VirtualBox VDI file containing a Windows XP installation with some 
software inside. I don't think I am supposed to post a downloadable link online, 
but let me know if you really need it. I could upload it and send the link to your 
private email address. Is the email you registered here a private one?

Regards,

Original comment by stackexc...@gmail.com on 24 Jun 2012 at 9:36

GoogleCodeExporter commented 9 years ago
OK, thanks for the traces.
If I understand correctly:

- The issue happens on the decoder side, whatever the compression algorithm 
(lz4 or lz4hc).

- The issue is random: you seem to get different errors at different positions 
on each try.

- Corrupted bytes are isolated; there are no long trails of errors. This part 
is really unusual. I see no valid reason for this to happen with the LZ4 
algorithm: an error should quickly propagate further into the file.

- Even more precisely, the error seems to concern a single bit. Listing your 
examples:
38 -> B8 (bit 7)
8B -> 0B (bit 7)
03 -> 83 (bit 7)
and so on.
It means the corrupted byte does not seem completely random.
This is unexpected: LZ4 is byte-oriented, not bit-oriented, and there is no 
masking anywhere during decoding. There is no reason it would get just one 
bit wrong: a wrong byte should be completely random.
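The bit positions listed above can be checked by XOR-ing each pair of bytes: the set bits of the result are exactly the bits that differ. A minimal sketch in Python follows; the helper name `flipped_bits` is made up for illustration.

```python
def flipped_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    delta = a ^ b  # a set bit in the XOR marks a differing bit
    return [bit for bit in range(8) if delta & (1 << bit)]


if __name__ == "__main__":
    # The three byte pairs listed in this comment.
    pairs = [(0x38, 0xB8), (0x8B, 0x0B), (0x03, 0x83)]
    for a, b in pairs:
        print(f"{a:02X} -> {b:02X}: differing bits {flipped_bits(a, b)}")
```

Running the same helper over the other byte pairs in the dumps is a quick way to see whether every difference is confined to one bit and whether the same bit positions keep recurring.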

OK, so since memset() seems to correct the issue, I'm starting to wonder if it 
could be related to the allocation mechanism: with memset(), memory is 
really allocated beforehand, while without it, memory is merely "reserved", to 
be allocated "just in time" later on. Maybe something does not happen properly 
at that stage. Maybe a race condition?
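To illustrate the "reserved" versus "really allocated" distinction: on Linux, an anonymous mapping is typically backed by physical pages only when each page is first written. The sketch below, in Python rather than C and purely for illustration, touches one byte per page up front, which is roughly the side effect a memset() of the destination buffer has. The helper name `pretouch` is made up.

```python
import mmap


def pretouch(buf, page_size=mmap.PAGESIZE):
    """Write one byte per page so every page is backed before real use.

    Returns the number of pages touched.
    """
    touched = 0
    for offset in range(0, len(buf), page_size):
        buf[offset] = 0  # the first write to a page forces its allocation
        touched += 1
    return touched


if __name__ == "__main__":
    size = 64 * mmap.PAGESIZE
    buf = mmap.mmap(-1, size)  # anonymous mapping, not backed by a file
    print(f"touched {pretouch(buf)} pages of {mmap.PAGESIZE} bytes")
    buf.close()
```

Touching one byte per page is cheaper than zeroing the whole buffer, but it is only a workaround: a correct decoder running on sound hardware should not need either.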

Last point: is this issue reproducible on another machine? Could it be 
hardware-related?

Regarding the sample file: yes, having a sample that reproduces the 
issue would help a lot in understanding it. You don't need the full file, just a 
small sample of it which triggers the issue. 
But if the problem is random, this might be difficult...
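One way to test the "random" hypothesis without sharing the file is to decode the same compressed input several times on the same machine and compare checksums of the outputs. A decoder is deterministic, so differing checksums for identical input point away from the algorithm. The sketch below is in Python and uses zlib purely as a stand-in, because it ships with Python; the real check would call the LZ4 decoder instead. The helper name `digests_over_runs` is made up.

```python
import hashlib
import zlib


def digests_over_runs(compressed, decompress, runs=5):
    """Decompress the same input `runs` times and return the set of SHA-256 digests."""
    return {
        hashlib.sha256(decompress(compressed)).hexdigest()
        for _ in range(runs)
    }


if __name__ == "__main__":
    # zlib is only a placeholder codec here (assumption); substitute the
    # actual LZ4 decode call when investigating the real issue.
    payload = bytes(range(256)) * 4096  # 1 MiB of sample data
    blob = zlib.compress(payload)
    digests = digests_over_runs(blob, zlib.decompress)
    if len(digests) == 1:
        print("all runs produced identical output")
    else:
        print(f"{len(digests)} distinct outputs from identical input")
```

More than one digest for the same input would be consistent with a memory or other hardware fault, while a single digest that is consistently wrong would point back at the software.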

Original comment by yann.col...@gmail.com on 25 Jun 2012 at 3:57

GoogleCodeExporter commented 9 years ago
>> This is unexpected : LZ4 is byte-oriented, not bit-oriented.
Thanks, I think that answers a lot. I need to suspect my machine now.

I originally thought some masking happened with the dest buffer. But since lz4 
does not use masking during decoding, this shouldn't happen. I actually tried 
dumping the memory before decompressing without memset yesterday, and I noticed 
that the whole memory was already all zeros.

>> it could be related to allocation mechanism : since with memset(), memory is 
really allocated beforehand, while without it, memory is merely "reserved", to 
be allocated "just in time" later on
I see, that makes sense.

>> Maybe a race condition ?
>> Could it be hardware-related ?
Could be. It may be a memory-related issue; I am actually extracting the file 
into a dynamically expanding ramdisk. I thought that shouldn't be a problem, so 
I did not mention it.

>> is this issue reproducible on another machine ?
I don't have another machine with enough memory to test same way at the moment.

I will test some more and report my progress. You can probably 
close the issue once this is confirmed.

Rgds,

Original comment by stackexc...@gmail.com on 26 Jun 2012 at 1:05

GoogleCodeExporter commented 9 years ago
You're welcome.
Please keep us posted on your progress.

Rgds

Original comment by yann.col...@gmail.com on 26 Jun 2012 at 7:38

GoogleCodeExporter commented 9 years ago
I tested without using the ramdisk, but unfortunately the results were the same, 
and the errors are still random. I also tested with another kernel on the same 
machine, with the same results.
So maybe this is just a problem with my machine.

Rgds,

Original comment by stackexc...@gmail.com on 27 Jun 2012 at 12:14

GoogleCodeExporter commented 9 years ago
Possibly.
At least, it's neither ramdisk- nor kernel-related.
I would feel better if the problem proved not to be reproducible on another 
machine.

Rgds

Original comment by yann.col...@gmail.com on 27 Jun 2012 at 2:24

GoogleCodeExporter commented 9 years ago
Closed.
The issue seems to be related to hardware problems.
Don't hesitate to reopen it if the above hypothesis proves wrong.

Original comment by yann.col...@gmail.com on 10 Jul 2012 at 10:55