Benchmark: TurboTranspose+iccodecs vs Quantile Compression

Quantile Compression/PCodec claims 35%-71% better compression than zstd.

I've integrated the Rust library into TurboPFor via its FFI bindings for comparison purposes. We use the synthetic dataset provided in the Quantile Compression repository, plus other real data with large integers. As real data with values larger than 32 bits is uncommon, we use 32-bit integers where possible rather than 64-bit integers for every file. Note that some files compress better with delta or the integrated zigzag delta in conjunction with TurboTranspose. Download icapp, test with your own data, and convince yourself.
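As a rough illustration of what "zigzag delta in conjunction with byte transpose" means, here is a minimal Python sketch. It is not TurboTranspose or icapp code: the function names are hypothetical, and `zlib` from the standard library stands in for zstd purely so the example is self-contained.

```python
import struct
import zlib

MASK32 = 0xFFFFFFFF

def zigzag_delta_encode(values):
    """Replace each 32-bit value by the zigzag-mapped difference to its predecessor."""
    out, prev = [], 0
    for v in values:
        d = (v - prev + 2**31) % 2**32 - 2**31   # delta wrapped into signed 32-bit range
        out.append(((d << 1) ^ (d >> 31)) & MASK32)  # zigzag: 0,-1,1,-2 -> 0,1,2,3
        prev = v
    return out

def zigzag_delta_decode(codes):
    """Inverse of zigzag_delta_encode."""
    out, prev = [], 0
    for z in codes:
        d = (z >> 1) ^ -(z & 1)
        prev = (prev + d) & MASK32
        out.append(prev)
    return out

def byte_transpose(values, width=4):
    """Group byte 0 of every value, then byte 1, and so on (little-endian)."""
    raw = struct.pack(f"<{len(values)}I", *values)
    return b"".join(raw[i::width] for i in range(width))

def byte_untranspose(data, width=4):
    """Inverse of byte_transpose."""
    n = len(data) // width
    raw = bytearray(len(data))
    for i in range(width):
        raw[i::width] = data[i * n:(i + 1) * n]
    return list(struct.unpack(f"<{n}I", bytes(raw)))

if __name__ == "__main__":
    # Hypothetical slowly varying input; real results depend entirely on the data.
    vals = [1_000_000 + 37 * i + (i * i % 11) for i in range(10_000)]
    plain = struct.pack(f"<{len(vals)}I", *vals)
    print("plain           ", len(zlib.compress(plain, 9)))
    print("transpose       ", len(zlib.compress(byte_transpose(vals), 9)))
    print("zzdelta+transpose", len(zlib.compress(byte_transpose(zigzag_delta_encode(vals)), 9)))
```

The transpose puts the mostly constant high bytes next to each other, which is what lets a general-purpose LZ codec find long matches; the delta step only helps when neighbouring values are close.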

32-bit integers: better compression and several times faster decompression with TurboTranspose+zstd

               
icapp i64*.txt -Ftu -e81 -Ezstd,22 
  size   ratio     E MB/s   D MB/s  function integer size=32 bits (lz=zstd,22)
450889  11.27%         19     6070  81:Lztp   Byte      Transpose  +zstd,22         i64_cents.txt
   182   0.0046%      239    28485  81:Lztp   Byte      Transpose  +zstd,22         i64_constant.txt
631400  15.79%          5     5837  81:Lztp   Byte      Transpose  +zstd,22         i64_dollars.txt
2693750  67.34%         12     4575  81:Lztp   Byte      Transpose  +zstd,22         i64_geo1M.txt
251570   6.29%         14     7335  81:Lztp   Byte      Transpose  +zstd,22         i64_geo2.txt
1028913  25.72%         12     2728  81:Lztp   Byte      Transpose  +zstd,22         i64_interleaved.txt
1640375  41.01%          4     2774  81:Lztp   Byte      Transpose  +zstd,22         i64_lomax15.txt
1592074  39.80%         12     3271  81:Lztp   Byte      Transpose  +zstd,22         i64_lomax25.txt
2006291  50.16%          5      983  81:Lztp   Byte      Transpose  +zstd,22         i64_misordered.txt
419053  10.48%          6     3915  81:Lztp   Byte      Transpose  +zstd,22         i64_normal1.txt
815743  20.39%          8     3719  81:Lztp   Byte      Transpose  +zstd,22         i64_normal10.txt
2898888  72.47%          6     3349  81:Lztp   Byte      Transpose  +zstd,22         i64_normal1M.txt
404996  10.12%          5     3687  81:Lztp   Byte      Transpose  +zstd,22         i64_slow_cosine.txt
 16027   0.40%          8    20169  81:Lztp   Byte      Transpose  +zstd,22         i64_sparse.txt
1411267  35.28%          5     2033  81:Lztp   Byte      Transpose  +zstd,22         i64_total_cents.txt
16261417  Total

icapp i64*.txt -Ftu -e173
    size    ratio     E MB/s   D MB/s  function integer size=32 bits
  450451   11.26%        189      431  173:qcomp            quantile compress           i64_cents.txt
      44  0.0011%        744      549  173:qcomp            quantile compress           i64_constant.txt
  620064   15.50%        159      448  173:qcomp            quantile compress           i64_dollars.txt
 2676957   66.92%         76      324  173:qcomp            quantile compress           i64_geo1M.txt
  250467    6.26%        212      571  173:qcomp            quantile compress           i64_geo2.txt
 2253101   56.33%         92      400  173:qcomp            quantile compress           i64_interleaved.txt
 1575073   39.38%         98      373  173:qcomp            quantile compress           i64_lomax15.txt
 1545171   38.63%        102      398  173:qcomp            quantile compress           i64_lomax25.txt
 2253103   56.33%         78      313  173:qcomp            quantile compress           i64_misordered.txt
  282581    7.06%        233      451  173:qcomp            quantile compress           i64_normal1.txt
  676116   16.90%        161      452  173:qcomp            quantile compress           i64_normal10.txt
 2754383   68.86%         74      336  173:qcomp            quantile compress           i64_normal1M.txt
  221218    5.53%        269      534  173:qcomp            quantile compress           i64_slow_cosine.txt
   14323    0.36%        718     2614  173:qcomp            quantile compress           i64_sparse.txt
 1158386   28.96%         95      229  173:qcomp            quantile compress           i64_total_cents.txt
16731437  Total

Floating point (64 bits): Quantile Compression/PCodec compresses slightly better in total, but its decompression is a lot slower (2-3x) than zstd

icapp f64*.txt -Ftd -e80 -Ezstd,22  
  size   ratio     E MB/s   D MB/s  function floating point size=64 bits (lz=zstd,22) unsorted -1
2412121  30.15%          3     1621  80:Lz               zstd,22                     f64_decimal_long.txt
  9111  37.96%          7      913  80:Lz               zstd,22                     f64_decimal_short.txt
4970116  62.13%          5     1377  80:Lz               zstd,22                     f64_edge_cases.txt
4247812  53.10%          3      729  80:Lz               zstd,22                     f64_integers.txt
7670370  95.88%          8     1348  80:Lz               zstd,22                     f64_normal_at_0.txt
6221137  77.76%          3     1212  80:Lz               zstd,22                     f64_normal_at_1000.txt
4073918  50.92%          6      992  80:Lz               zstd,22                     f64_slow_cosine.txt
29604584 Total

icapp f64*.txt -Ftd -e173
    size    ratio     E MB/s   D MB/s  function floating point size=64 bits
 6686504   83.58%        189      605  173:qcomp            quantile compress           f64_decimal_long.txt
   20134   83.89%          8      652  173:qcomp            quantile compress           f64_decimal_short.txt
 4364570   54.56%        172      540  173:qcomp            quantile compress           f64_edge_cases.txt
 3754251   46.93%        133      675  173:qcomp            quantile compress           f64_integers.txt
 6943689   86.80%        131      518  173:qcomp            quantile compress           f64_normal_at_0.txt
 5638910   70.49%        138      551  173:qcomp            quantile compress           f64_normal_at_1000.txt
 1813077   22.66%        155      493  173:qcomp            quantile compress           f64_slow_cosine.txt
29221134  Total

Timestamps (64 bits): Quantile Compression is better on one file and worse on the other (the totals are nearly equal), but its decompression is a lot slower (4-6x) than TurboTranspose+zstd

icapp micro*.* -FtT -e173
  size   ratio     E MB/s   D MB/s  function integer size=64 bits
2497182  31.21%        140      640  173:qcomp            quantile compress           micros_millis.txt.ts
3742368  46.78%        195      793  173:qcomp            quantile compress           micros_near_linear.txt.ts
6239549  Total

icapp micro*.* -FtT -e81 -Ezstd,22
    size    ratio     E MB/s   D MB/s  function integer size=64 bits
 3385201   42.32%         16     4089  81:Lztp   Byte      Transpose  +zstd,22         micros_millis.txt.ts
 2800155   35.00%         21     3367  81:Lztp   Byte      Transpose  +zstd,22         micros_near_linear.txt.ts
 6185355  Total
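The decompression gap on the timestamp files can be read directly off the D MB/s columns of the two runs above. The short Python sketch below just redoes that arithmetic; the dictionary and function names are made up for illustration, and the numbers are copied from the icapp output.

```python
# (compressed size in bytes, decompression MB/s) copied from the icapp runs above
qcomp = {
    "micros_millis.txt.ts": (2497182, 640),
    "micros_near_linear.txt.ts": (3742368, 793),
}
transpose_zstd = {
    "micros_millis.txt.ts": (3385201, 4089),
    "micros_near_linear.txt.ts": (2800155, 3367),
}

def compare(a, b):
    """Per file: (size of a relative to b, decompression speed of b relative to a)."""
    rows = {}
    for name, (size_a, dec_a) in a.items():
        size_b, dec_b = b[name]
        rows[name] = (size_a / size_b, dec_b / dec_a)
    return rows

if __name__ == "__main__":
    for name, (size_ratio, speedup) in compare(qcomp, transpose_zstd).items():
        print(f"{name}: qcomp size = {size_ratio:.2f}x, "
              f"TurboTranspose+zstd decodes {speedup:.1f}x faster")
```

With these figures, TurboTranspose+zstd decompresses roughly 6.4x faster on micros_millis and 4.2x faster on micros_near_linear, while the size advantage flips between the two files.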

Non-synthetic datasets and LZ77 offset output: test1_demo (text) and test3_demo (binary). These are typical data with a mix of small, medium and large integers. As iccodecs we use "zstd,15" and TurboVLC+"turborc,56" (entropy coding only, with an adaptive Asymmetric Numeral System coder). Quantile Compression is not competitive here, and its decompression is several (7-60) times slower. TurboVLC+rANS compresses better and compresses/decompresses faster.
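To see why variable-length coding suits data that mixes small, medium and large integers, here is a generic byte-oriented varint (LEB128-style) sketch in Python. This is an assumption-laden illustration only: it is not TurboVLC's actual format, and the function names are hypothetical.

```python
def varint_encode(values):
    """Encode non-negative integers with 7 payload bits per byte (LEB128-style)."""
    out = bytearray()
    for v in values:
        while v >= 0x80:
            out.append((v & 0x7F) | 0x80)  # high bit set: more bytes follow
            v >>= 7
        out.append(v)                      # last byte has the high bit clear
    return bytes(out)

def varint_decode(data):
    """Inverse of varint_encode."""
    values, cur, shift = [], 0, 0
    for b in data:
        cur |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            values.append(cur)
            cur, shift = 0, 0
    return values

if __name__ == "__main__":
    mixed = [1, 300, 70_000]               # small, medium, large
    encoded = varint_encode(mixed)
    print(len(encoded), "bytes instead of", 4 * len(mixed))
```

Small values cost one byte and only the rare large ones pay for more, so the output is already compact before an entropy coder or LZ codec such as those benchmarked below is applied on top.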

icapp -Ezstd,15  CCNEWS-RLZ-D64-FLENS.txt -Ftu -e81,96,80,173,3
  size   ratio     E MB/s   D MB/s   function integer size=32 bits
22145289   5.54%         29     3525    81:Lztp   Byte      Transpose  +zstd,15         
23693811   5.92%         32     2743    96:vlccomp          TurboVLC  +zstd,15          
29382157   7.35%          9     3536    80:Lz               zstd,15                     
59957497  14.99%        367      692    96:vlccomp          TurboVLC  +turborc,56 (=rANS)
62529619  15.63%        164      345   173:qcomp            quantile compress          
77585707  19.40%       1820    11324     3:p4nenc256v32     TurboPFor256

icapp -Ezstd,15 CCNEWS-RLZ-D64-FOFFSETS.txt -Ftu -e81,96,80,173,3
 93751603   23.44%         19     2745   80:Lz               zstd,15
283069853   70.77%         56     2622   96:vlccomp          TurboVLC  +zstd,15
322425616   80.61%        338      651   96:vlccomp          TurboVLC  +turborc,56 (=rANS)
323345103   80.84%         73      219  173:qcomp            quantile compress
325331435   81.33%       2444    10740    3:p4nenc256v32     TurboPFor256

icapp -Ezstd,15 news-docs.2016-WORD.txt -Ftu -e81,96,80,173,3
142677882   35.67%          4     1444   80:Lz               zstd,15
145450083   36.36%         37     1546   96:vlccomp          TurboVLC  +zstd,15
148119568   37.03%         11     1550   81:Lztp   Byte      Transpose  +zstd,15
151616778   37.90%        313      605   96:vlccomp          TurboVLC  +turborc,56
189513565   47.38%         82      212  173:qcomp            quantile compress
181946393   45.49%       1580     7641    3:p4nenc256v32     TurboPFor256

icapp -Ezstd,15 news-docs.2016-WORD-BWTMTF.txt -Ftu -e81,96,80,173,3
103706209   25.93%        306      558   96:vlccomp          TurboVLC  +turborc,56
105855336   26.46%        127      303  173:qcomp            quantile compress
105872416   26.47%         29     1251   96:vlccomp          TurboVLC  +zstd,15
116101605   29.03%         11     1745   81:Lztp   Byte      Transpose  +zstd,15
136893715   34.22%          4     1319   80:Lz               zstd,15
135115053   33.78%       1561     9292    3:p4nenc256v32     TurboPFor256

icapp -Ezstd,15 test1_demo_o.u32 -e81,96,80,173,3
 71858387   65.93%        332      650   96:vlccomp          TurboVLC  +turborc,56
 72044472   66.10%        214     2036   96:vlccomp          TurboVLC  +zstd,15
 72142814   66.19%         74      242  173:qcomp            quantile compress
 77852927   71.43%          7     1324   81:Lztp   Byte      Transpose  +zstd,15
 78282925   71.82%       1364     8745    3:p4nenc256v32     TurboPFor256
 84333237   77.37%          6     1007   80:Lz               zstd,15

icapp -Ezstd,15 test3_demo_o.u32 -e81,96,80,173,3
 15946736   34.18%         11    13588   81:Lztp   Byte      Transpose  +zstd,15
 16182167   34.68%          8     1293   80:Lz               zstd,15
 17707120   37.95%         41     1807   96:vlccomp          TurboVLC  +zstd,15
 17734852   38.01%        321      637   96:vlccomp          TurboVLC  +turborc,56
 20344975   43.60%         97      226  173:qcomp            quantile compress
 22870847   49.01%       1500     8905    3:p4nenc256v32     TurboPFor256
