Cyan4973 / FiniteStateEntropy

New generation entropy codecs : Finite State Entropy and Huff0
BSD 2-Clause "Simplified" License

benchmark mode claims a decompression error on this data w/huf #102

Open adamdmoss opened 4 years ago

adamdmoss commented 4 years ago
 % ./fse -h -b ./xxx    
FSE : Finite State Entropy, 64-bits demo by Yann Collet (Jul 17 2020)
!! Error decompressing block 4 of cSize 18041 !! => (Corrupted block detected)

Gunzip the file below and run the above: xxx.gz

The 'xxx' file appears to survive a huf compress and then a huf decompress intact when each is run individually, so perhaps this issue is specific to the benchmark mode.

Tested with 3865a704e8079fb45fc921480a6c351bb6bfbd72

adamdmoss commented 4 years ago
 % cc --version
cc (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

adamdmoss commented 4 years ago

also occurs with:

 % clang --version
clang version 9.0.0-2~ubuntu18.04.2 (tags/RELEASE_900/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

... so I suppose, not a compiler issue.

adamdmoss commented 4 years ago

Oh, yikes, might be a dupe of https://github.com/Cyan4973/FiniteStateEntropy/issues/95 ... except this repro data is tiny. :)

Cyan4973 commented 4 years ago

I think the issue is that the xxx file is > 128 KB.

The Huffman format requires input data to be provided in blocks <= 128 KB. When the input is longer than that, it must be split accordingly. Splitting is performed by the I/O layer, but is not present in the benchmark module.

Nevertheless, while there is an explanation, this shows that the error message is not clear enough about the root cause.
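
For readers wondering what the splitting described above might look like, here is a minimal sketch in C. It is not the project's actual I/O layer code: `MAX_BLOCK_SIZE`, `block_fn`, `for_each_block` and `track_largest` are hypothetical names made up for illustration, and the only fact taken from this thread is the 128 KB limit. The idea is simply to walk the input and hand it to a per-block callback in chunks that never exceed the limit.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the 128 KB per-block limit mentioned in this thread. */
#define MAX_BLOCK_SIZE ((size_t)128 * 1024)

/* Hypothetical callback type: processes one block, returns 0 on success. */
typedef int (*block_fn)(const void* block, size_t blockSize, void* ctx);

/* Feeds src to fn in consecutive chunks of at most MAX_BLOCK_SIZE bytes.
 * Returns the number of blocks processed, or (size_t)-1 if fn reports failure. */
size_t for_each_block(const void* src, size_t srcSize, block_fn fn, void* ctx)
{
    const unsigned char* ip = (const unsigned char*)src;
    size_t remaining = srcSize;
    size_t nbBlocks = 0;

    assert(fn != NULL);
    while (remaining > 0) {
        /* The last chunk may be shorter than the limit. */
        size_t const chunk = (remaining < MAX_BLOCK_SIZE) ? remaining : MAX_BLOCK_SIZE;
        if (fn(ip, chunk, ctx) != 0) return (size_t)-1;
        ip += chunk;
        remaining -= chunk;
        nbBlocks++;
    }
    return nbBlocks;
}

/* Example callback: records the largest block size it has been given. */
int track_largest(const void* block, size_t blockSize, void* ctx)
{
    size_t* const largest = (size_t*)ctx;
    (void)block;  /* content is not inspected in this sketch */
    if (blockSize > *largest) *largest = blockSize;
    return 0;
}
```

In a real setup the callback would presumably pass each block to the Huffman compressor (or decompressor) and record the compressed size per block, so that decompression can be driven block by block as well. A benchmark loop that skips this step and passes a larger buffer in one call would hit the limit described above.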

adamdmoss commented 4 years ago

Ah! That was really unobvious. Thanks for the explanation. :)