I've integrated the Rust library into TurboPFor through its FFI bindings for comparison purposes.
We use the synthetic dataset provided in the Quantile Compression repository, plus other real data with large integers.
Since real data with values wider than 32 bits is uncommon, we use 32-bit integers where possible instead of 64-bit integers for all files. Note that some files compress better with delta or the integrated zigzag delta in conjunction with TurboTranspose. Download icapp, test with your own data, and convince yourself.
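To illustrate why zigzag delta combined with a byte transpose can help a general-purpose compressor, here is a minimal Python sketch. It is not TurboPFor's implementation: the function names are hypothetical, the code is unoptimized, and the standard-library zlib stands in for zstd purely so the example has no external dependencies.

```python
import zlib

def zigzag_delta_encode(values, bits=64):
    """Delta-encode integers, then zigzag-map each signed delta to an unsigned code."""
    mask = (1 << bits) - 1
    out, prev = [], 0
    for v in values:
        d = v - prev                      # signed difference to the previous value
        prev = v
        # zigzag mapping: 0, -1, 1, -2, 2 ... -> 0, 1, 2, 3, 4 ...
        out.append(((d << 1) ^ (d >> (bits - 1))) & mask)
    return out

def zigzag_delta_decode(codes):
    """Inverse of zigzag_delta_encode."""
    out, prev = [], 0
    for z in codes:
        d = (z >> 1) ^ -(z & 1)           # undo the zigzag mapping
        prev += d                         # undo the delta
        out.append(prev)
    return out

def byte_transpose(values, width=8):
    """Group byte 0 of every value, then byte 1, and so on (little-endian)."""
    raw = b"".join(v.to_bytes(width, "little") for v in values)
    return b"".join(raw[i::width] for i in range(width))

def byte_untranspose(data, width=8):
    """Inverse of byte_transpose."""
    n = len(data) // width
    return [
        int.from_bytes(bytes(data[p * n + i] for p in range(width)), "little")
        for i in range(n)
    ]

if __name__ == "__main__":
    # Hypothetical near-linear microsecond timestamps (made-up data).
    ts = [1_700_000_000_000_000 + 1000 * i + (i * 7) % 13 for i in range(100_000)]
    plain = b"".join(v.to_bytes(8, "little") for v in ts)
    transformed = byte_transpose(zigzag_delta_encode(ts))
    print("zlib on raw bytes:           ", len(zlib.compress(plain, 9)))
    print("zlib on zigzag delta + transpose:", len(zlib.compress(transformed, 9)))
```

After the delta step the high-order bytes of most values are zero, and the transpose places those bytes next to each other, which gives the back-end compressor long, highly repetitive runs. Whether this wins on a given file depends on the data, which is why testing with icapp on your own files is the better guide.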
32-bit integers:
TurboTranspose+zstd gives better compression and several times faster decompression.
Timestamps (64 bits)
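Quantile Compression compresses micros_millis better (31.21% vs. 42.32%) but micros_near_linear worse (46.78% vs. 35.00%), so the totals are close, and its decompression is much slower (roughly 4-6x) than TurboTranspose+zstd.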
icapp micro*.* -FtT -e173
size ratio E MB/s D MB/s function integer size=64 bits
2497182 31.21% 140 640 173:qcomp quantile compress micros_millis.txt.ts
3742368 46.78% 195 793 173:qcomp quantile compress micros_near_linear.txt.ts
6239549 Total
icapp micro*.* -FtT -e81 -Ezstd,22
size ratio E MB/s D MB/s function integer size=64 bits
3385201 42.32% 16 4089 81:Lztp Byte Transpose +zstd,22 micros_millis.txt.ts
2800155 35.00% 21 3367 81:Lztp Byte Transpose +zstd,22 micros_near_linear.txt.ts
6185355 Total
Non-synthetic datasets plus LZ77 offset output: test1_demo (text) and test3_demo (binary). These are typical data with a mix of small, medium and large integers.
As iccodec we use "zstd,15" and, with TurboVLC, "turborc,56" (entropy coding only, with an adaptive Asymmetric Numeral System coder).
Quantile Compression is not competitive, and its decompression is several (7 - 60) times slower.
TurboVLC+rANS compresses better and both compresses and decompresses faster.
Quantile Compression/PCodec claims 35%-71% better compression than zstd.
icapp i64*.txt -Ftu -e173
size ratio E MB/s D MB/s function integer size=32 bits
450451 11.26% 189 431 173:qcomp quantile compress i64_cents.txt
44 0.0011% 744 549 173:qcomp quantile compress i64_constant.txt
620064 15.50% 159 448 173:qcomp quantile compress i64_dollars.txt
2676957 66.92% 76 324 173:qcomp quantile compress i64_geo1M.txt
250467 6.26% 212 571 173:qcomp quantile compress i64_geo2.txt
2253101 56.33% 92 400 173:qcomp quantile compress i64_interleaved.txt
1575073 39.38% 98 373 173:qcomp quantile compress i64_lomax15.txt
1545171 38.63% 102 398 173:qcomp quantile compress i64_lomax25.txt
2253103 56.33% 78 313 173:qcomp quantile compress i64_misordered.txt
282581 7.06% 233 451 173:qcomp quantile compress i64_normal1.txt
676116 16.90% 161 452 173:qcomp quantile compress i64_normal10.txt
2754383 68.86% 74 336 173:qcomp quantile compress i64_normal1M.txt
221218 5.53% 269 534 173:qcomp quantile compress i64_slow_cosine.txt
14323 0.36% 718 2614 173:qcomp quantile compress i64_sparse.txt
1158386 28.96% 95 229 173:qcomp quantile compress i64_total_cents.txt
16731437 Total
icapp f64*.txt -Ftd -e173
size ratio E MB/s D MB/s function floating point size=64 bits
6686504 83.58% 189 605 173:qcomp quantile compress f64_decimal_long.txt
20134 83.89% 8 652 173:qcomp quantile compress f64_decimal_short.txt
4364570 54.56% 172 540 173:qcomp quantile compress f64_edge_cases.txt
3754251 46.93% 133 675 173:qcomp quantile compress f64_integers.txt
6943689 86.80% 131 518 173:qcomp quantile compress f64_normal_at_0.txt
5638910 70.49% 138 551 173:qcomp quantile compress f64_normal_at_1000.txt
1813077 22.66% 155 493 173:qcomp quantile compress f64_slow_cosine.txt
29221134 Total
icapp -Ezstd,15 CCNEWS-RLZ-D64-FOFFSETS.txt -Ftu -e81,96,80,173,3
93751603 23.44% 19 2745 80:Lz zstd,15
283069853 70.77% 56 2622 96:vlccomp TurboVLC +zstd,15
322425616 80.61% 338 651 96:vlccomp TurboVLC +turborc,56 (=rANS)
323345103 80.84% 73 219 173:qcomp quantile compress
325331435 81.33% 2444 10740 3:p4nenc256v32 TurboPFor256
icapp -Ezstd,15 news-docs.2016-WORD.txt -Ftu -e81,96,80,173,3
142677882 35.67% 4 1444 80:Lz zstd,15
145450083 36.36% 37 1546 96:vlccomp TurboVLC +zstd,15
148119568 37.03% 11 1550 81:Lztp Byte Transpose +zstd,15
151616778 37.90% 313 605 96:vlccomp TurboVLC +turborc,56
189513565 47.38% 82 212 173:qcomp quantile compress
181946393 45.49% 1580 7641 3:p4nenc256v32 TurboPFor256
icapp -Ezstd,15 news-docs.2016-WORD-BWTMTF.txt -Ftu -e81,96,80,173,3
103706209 25.93% 306 558 96:vlccomp TurboVLC +turborc,56
105855336 26.46% 127 303 173:qcomp quantile compress
105872416 26.47% 29 1251 96:vlccomp TurboVLC +zstd,15
116101605 29.03% 11 1745 81:Lztp Byte Transpose +zstd,15
136893715 34.22% 4 1319 80:Lz zstd,15
135115053 33.78% 1561 9292 3:p4nenc256v32 TurboPFor256
icapp -Ezstd,15 test1_demo_o.u32 -e81,96,80,173,3
71858387 65.93% 332 650 96:vlccomp TurboVLC +turborc,56
72044472 66.10% 214 2036 96:vlccomp TurboVLC +zstd,15
72142814 66.19% 74 242 173:qcomp quantile compress
77852927 71.43% 7 1324 81:Lztp Byte Transpose +zstd,15
78282925 71.82% 1364 8745 3:p4nenc256v32 TurboPFor256
84333237 77.37% 6 1007 80:Lz zstd,15
icapp -Ezstd,15 test3_demo_o.u32 -e81,96,80,173,3
15946736 34.18% 11 13588 81:Lztp Byte Transpose +zstd,15
16182167 34.68% 8 1293 80:Lz zstd,15
17707120 37.95% 41 1807 96:vlccomp TurboVLC +zstd,15
17734852 38.01% 321 637 96:vlccomp TurboVLC +turborc,56
20344975 43.60% 97 226 173:qcomp quantile compress
22870847 49.01% 1500 8905 3:p4nenc256v32 TurboPFor256