kapenga / LittleBit

LittleBit is a pure Huffman coding compression algorithm with the option of random access reading while offering competitive compression ratios.

suggestion for decoder class #4

Closed da0ka closed 3 years ago

da0ka commented 3 years ago

I think hasEndOfLine may not be needed for decompression. If node.data.length == 0 or node.data == null, break out of the loop.

Decoder.java: line 73 nodes[i][x].data = new byte[0]; // or nodes[i][x].data = null
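The suggestion can be sketched roughly as follows. This is a minimal illustration, not the code in Decoder.java: the `Node` class, the `decodeUntilSentinel` method, and the idea that the decoder yields a flat sequence of nodes are all assumptions made for the example. Only the `data` field name and the empty-or-null check come from the comment above.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

final class SentinelDecodeSketch {
    // Hypothetical stand-in for a decoder node; only the `data` field is taken from the thread.
    static final class Node {
        final byte[] data;
        Node(byte[] data) { this.data = data; }
    }

    // Copies node payloads to the output until a node with null or empty data is reached.
    // Such a node is treated as the end marker, so no separate hasEndOfLine flag is consulted.
    static byte[] decodeUntilSentinel(List<Node> decodedNodes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Node node : decodedNodes) {
            if (node.data == null || node.data.length == 0) {
                break; // end marker reached
            }
            out.write(node.data, 0, node.data.length);
        }
        return out.toByteArray();
    }
}
```

This only works if the end marker always arrives as a node of its own, with no payload attached.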

kapenga commented 3 years ago

This is a bit more complicated. When you encode a single file there is only one EOF or EOL (end of line). It will always be encoded separately, because the minimum count for merging is larger than 1. In that case you are correct.

But when there are many EOLs (fields from a database, for example), the encoder starts to incorporate the EOLs into merged nodes. Those merged nodes contain data with the EOL at the end. Therefore we need to keep track of the EOLs in higher-level nodes.
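The merged-node case can be illustrated with a small sketch. It is not the actual Decoder.java: the `Node` class, the `decodeRecords` method, and the flat list of already-decoded nodes are assumptions for the example. The `data` and `hasEndOfLine` names are the ones used in this thread. A merged node carries non-empty data and also ends a record, so an empty-data check alone would miss its EOL.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class EolFlagDecodeSketch {
    // Hypothetical node shape: payload bytes plus a flag saying the payload ends with an EOL.
    static final class Node {
        final byte[] data;
        final boolean hasEndOfLine;
        Node(byte[] data, boolean hasEndOfLine) {
            this.data = data;
            this.hasEndOfLine = hasEndOfLine;
        }
    }

    // Appends each node's payload to the current record and closes the record
    // whenever the node is flagged as ending with an EOL.
    static List<String> decodeRecords(List<Node> decodedNodes) {
        List<String> records = new ArrayList<>();
        ByteArrayOutputStream current = new ByteArrayOutputStream();
        for (Node node : decodedNodes) {
            current.write(node.data, 0, node.data.length);
            if (node.hasEndOfLine) {
                records.add(new String(current.toByteArray(), StandardCharsets.ISO_8859_1));
                current.reset();
            }
        }
        return records;
    }
}
```

With nodes ("Al", no EOL), ("ice", EOL), ("Bob", EOL), the sketch yields the records "Alice" and "Bob". None of these nodes has empty data, so a decoder that only looked for an empty node would never find the record boundaries.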

It adds a small layer of complexity to the decoder, but it does not hurt compression ratios or decoding speed.

kapenga commented 3 years ago

I am closing this issue because there was a good reason for this design and the topic starter did not provide reasons to think otherwise.