burmanm / compression-int

64-bit integer compression algorithms in Java
Apache License 2.0

Scalar integer compression algorithm based on the MILC #4

Open burmanm opened 5 years ago

burmanm commented 5 years ago

Taking some inspiration from "MILC: Inverted List Compression in Memory", we should implement their algorithm, with some changes for Java and some improvements to the compression ratio:

burmanm commented 5 years ago

As an implementation note, optimal selection of words (as opposed to left-greedy selection) was not beneficial for inverted indexing with the Simple family, as shown in the paper "Optimal Packing in Simple Family Codecs".

In the MILC paper they arrive at a different conclusion, so it might be worth comparing the optimal page-size solution against a left-greedy approach. The latter might not give a significantly worse compression ratio, but could give a large performance improvement.
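
To make the comparison concrete, here is a rough sketch of how the two strategies could be measured against each other. It is not the MILC layout: the cost model (a fixed `HEADER_BITS` per block plus fixed-width offsets from the block's first value), the class name `PartitionSketch` and the greedy stopping rule are all assumptions made for illustration only.

```java
final class PartitionSketch {
    // Assumed per-block overhead in bits (e.g. a 64-bit base plus 16 bits of metadata).
    static final int HEADER_BITS = 80;

    // Bits needed for the largest offset from the base in sorted[from..to).
    static int width(long[] sorted, int from, int to) {
        return 64 - Long.numberOfLeadingZeros(sorted[to - 1] - sorted[from]);
    }

    // Cost in bits of one block: header plus one fixed-width offset
    // for every element after the base value.
    static long blockCost(long[] sorted, int from, int to) {
        return HEADER_BITS + (long) (to - from - 1) * width(sorted, from, to);
    }

    // Left-greedy (assumed rule): keep extending the current block while
    // adding the next value costs no more than opening a new block would.
    static long greedyCost(long[] sorted) {
        long total = 0;
        int start = 0;
        for (int end = 1; end <= sorted.length; end++) {
            boolean last = end == sorted.length;
            if (last || blockCost(sorted, start, end + 1)
                    - blockCost(sorted, start, end) > HEADER_BITS) {
                total += blockCost(sorted, start, end);
                start = end;
            }
        }
        return total;
    }

    // Optimal partitioning by dynamic programming over all split points, O(n^2).
    static long optimalCost(long[] sorted) {
        int n = sorted.length;
        long[] best = new long[n + 1]; // best[i] = minimal cost of the first i values
        for (int end = 1; end <= n; end++) {
            best[end] = Long.MAX_VALUE;
            for (int start = 0; start < end; start++) {
                best[end] = Math.min(best[end],
                        best[start] + blockCost(sorted, start, end));
            }
        }
        return best[n];
    }
}
```

For the sorted input `{1, 2, 3, 4, 1000000, 1000001, 1000002}` this model gives 200 bits for the greedy pass (it never splits, because each single step looks cheaper than a new header) and 170 bits for the optimal split after the fourth value. The greedy pass is linear while the dynamic program is quadratic, which is the trade-off worth benchmarking on real data.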