This library provides high-performance (GB/s) compression and decompression of integers (int32/uint32/int64/uint64).
A good compression factor can be achieved when, on average, the difference between two consecutive input values remains small and can therefore be encoded with fewer bits.
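To make the point about small differences concrete, here is a minimal sketch of differential coding. It is only an illustration and does not use this library; the helper names `deltas` and `maxBits` and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// deltas returns the difference between each value and its predecessor.
// The first value is kept as-is (its predecessor is taken to be 0).
func deltas(in []int32) []int32 {
	out := make([]int32, len(in))
	prev := int32(0)
	for i, v := range in {
		out[i] = v - prev
		prev = v
	}
	return out
}

// maxBits returns the number of bits needed to store the largest value
// of a slice of non-negative integers.
func maxBits(in []int32) int {
	var m uint32
	for _, v := range in {
		if uint32(v) > m {
			m = uint32(v)
		}
	}
	return bits.Len32(m)
}

func main() {
	// Hypothetical sorted offsets: large values, small gaps.
	input := []int32{1000000, 1000003, 1000007, 1000012, 1000020}
	d := deltas(input)
	fmt.Println(d)              // [1000000 3 4 5 8]
	fmt.Println(maxBits(input)) // 20 bits per raw value
	fmt.Println(maxBits(d[1:])) // 4 bits per delta
}
```

In this example every raw value needs 20 bits, while each delta after the first fits in 4 bits.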
Common use cases:
The encoding schemes used here are based on Dr. Daniel Lemire's research.
Data is encoded in blocks that are multiples of 128x32-bit or 256x64-bit inputs, in the following manner:
Any remaining input that does not fill a 128x32-bit or 256x64-bit block is encoded in an additional block using Variable Byte encoding (with delta).
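As an illustration of Variable Byte encoding with delta, the sketch below stores each difference in 7-bit groups, using the high bit of every byte as a continuation flag. This is a generic version of the technique, not the byte layout this library actually produces; the function names are hypothetical, and the sketch is intended for non-decreasing unsigned input (it does not handle negative deltas compactly).

```go
package main

import "fmt"

// encodeVarByteDelta writes the difference between each value and the
// previous one using 7 payload bits per byte, least significant group
// first. A set high bit means "more bytes follow for this delta".
func encodeVarByteDelta(in []uint32) []byte {
	var out []byte
	prev := uint32(0)
	for _, v := range in {
		d := v - prev
		prev = v
		for d >= 0x80 {
			out = append(out, byte(d)|0x80)
			d >>= 7
		}
		out = append(out, byte(d))
	}
	return out
}

// decodeVarByteDelta reverses encodeVarByteDelta: it reassembles each
// delta and adds it to the running value.
func decodeVarByteDelta(buf []byte) []uint32 {
	var out []uint32
	var prev, d uint32
	var shift uint
	for _, b := range buf {
		d |= uint32(b&0x7f) << shift
		if b&0x80 != 0 {
			shift += 7
			continue
		}
		prev += d
		out = append(out, prev)
		d, shift = 0, 0
	}
	return out
}

func main() {
	input := []uint32{100, 105, 300, 20000}
	buf := encodeVarByteDelta(input)
	fmt.Println(len(buf))                // 7 (bytes, versus 16 uncompressed)
	fmt.Println(decodeVarByteDelta(buf)) // [100 105 300 20000]
}
```

Small deltas cost a single byte each, which is why this scheme suits a short tail that is too small to fill a bit-packed block.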
In stream processing systems, data is usually received in chunks. Compressing and aggregating small chunks can be inefficient and impractical.
This API provides a convenient way to handle such inputs: when data is added to a compressed buffer whose last block is a small block encoded with Variable Byte, that block is rewritten using bit packing to provide better compression.
Timestamps with nanosecond resolution sometimes have a lower actual internal resolution (e.g. microsecond). The int64 encoding algorithm includes a specific optimization that improves the compression factor in such cases.
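The text above does not describe how the optimization works internally. The sketch below only illustrates why such data carries exploitable redundancy: when every delta shares a common divisor, dividing it out reduces the number of bits needed. The helper names `gcd` and `deltaBits` and the sample timestamps are hypothetical, and this is not necessarily the method the library uses.

```go
package main

import (
	"fmt"
	"math/bits"
)

// gcd returns the greatest common divisor of a and b (gcd(0, b) == b).
func gcd(a, b uint64) uint64 {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// deltaBits returns the bits needed for the largest delta between
// consecutive timestamps, before and after dividing the deltas by their
// common divisor. It assumes increasing timestamps.
func deltaBits(ts []int64) (raw, scaled int) {
	var maxDelta, g uint64
	for i := 1; i < len(ts); i++ {
		d := uint64(ts[i] - ts[i-1])
		if d > maxDelta {
			maxDelta = d
		}
		g = gcd(g, d)
	}
	if g == 0 {
		return 0, 0
	}
	return bits.Len64(maxDelta), bits.Len64(maxDelta / g)
}

func main() {
	// Hypothetical nanosecond timestamps that only change in microsecond steps.
	ts := []int64{
		1700000000000000000,
		1700000000000003000,
		1700000000000010000,
		1700000000000022000,
	}
	raw, scaled := deltaBits(ts)
	fmt.Println(raw, scaled) // 14 4
}
```

Here the largest delta (12000 ns) needs 14 bits as-is, but only 4 bits once the common factor of 1000 is removed.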
input := []int32{1, 2, 3}
// compress
compressed := intcomp.CompressInt32(input, nil)
// compress more data (append)
compressed = intcomp.CompressInt32([]int32{4, 5, 6}, compressed)
// uncompress
data := intcomp.UncompressInt32(compressed, nil)
// data: [1, 2, 3, 4, 5, 6]
Benchmarks for bit packing compression/decompression (MacBook Pro M1). The results vary depending on the number of bits used to encode the integers.