sunchao / parquet-rs

Apache Parquet implementation in Rust
Apache License 2.0

Use compression level 1 for Brotli and update crate version #109

Closed sunchao closed 6 years ago

sunchao commented 6 years ago

This tries to improve the Brotli compression performance by doing the following:

  1. Upgrade the Brotli crate from 1.1.2 to 2.0.1.
  2. Change the "compression quality" level from 9 to 1 (Brotli accepts levels 0 to 11, where higher levels trade speed for a better compression ratio).
  3. Change compress to use CompressorWriter instead of CompressorReader.

Bench result before:

test compress_brotli_binary    ... bench:  61,637,026 ns/iter (+/- 16,641,301) = 5 MB/s
test compress_brotli_boolean   ... bench:   1,563,393 ns/iter (+/- 693,189)
test compress_brotli_double    ... bench:  16,397,775 ns/iter (+/- 3,019,857) = 4 MB/s
test compress_brotli_fixed     ... bench:     326,538 ns/iter (+/- 143,024) = 40 MB/s
test compress_brotli_float     ... bench:  14,645,035 ns/iter (+/- 3,252,120) = 2 MB/s
test compress_brotli_int32     ... bench:  20,020,357 ns/iter (+/- 4,143,552) = 2 MB/s
test compress_brotli_int64     ... bench:  20,739,429 ns/iter (+/- 4,357,913) = 3 MB/s
test compress_brotli_int96     ... bench:      44,012 ns/iter (+/- 5,365)
test decompress_brotli_binary  ... bench:   3,086,329 ns/iter (+/- 203,754) = 215 MB/s
test decompress_brotli_boolean ... bench:       6,985 ns/iter (+/- 1,180) = 95040 MB/s
test decompress_brotli_double  ... bench:     875,386 ns/iter (+/- 41,056) = 758 MB/s
test decompress_brotli_fixed   ... bench:      25,462 ns/iter (+/- 5,123) = 26072 MB/s
test decompress_brotli_float   ... bench:     462,026 ns/iter (+/- 63,771) = 1436 MB/s
test decompress_brotli_int32   ... bench:     332,778 ns/iter (+/- 47,271) = 1994 MB/s
test decompress_brotli_int64   ... bench:     129,508 ns/iter (+/- 14,143) = 5126 MB/s
test decompress_brotli_int96   ... bench:       9,557 ns/iter (+/- 774) = 69463 MB/s

After:

test compress_brotli_binary    ... bench:   3,923,306 ns/iter (+/- 2,145,425) = 93 MB/s
test compress_brotli_boolean   ... bench:      25,157 ns/iter (+/- 9,708) = 51 MB/s
test compress_brotli_double    ... bench:     462,516 ns/iter (+/- 52,214) = 172 MB/s
test compress_brotli_fixed     ... bench:      13,354 ns/iter (+/- 2,434) = 1001 MB/s
test compress_brotli_float     ... bench:     261,208 ns/iter (+/- 67,905) = 153 MB/s
test compress_brotli_int32     ... bench:     275,277 ns/iter (+/- 36,415) = 148 MB/s
test compress_brotli_int64     ... bench:     168,823 ns/iter (+/- 31,994) = 481 MB/s
test compress_brotli_int96     ... bench:       4,535 ns/iter (+/- 740) = 3 MB/s
test decompress_brotli_binary  ... bench:   3,282,566 ns/iter (+/- 733,192) = 202 MB/s
test decompress_brotli_boolean ... bench:       6,930 ns/iter (+/- 587) = 95795 MB/s
test decompress_brotli_double  ... bench:     950,969 ns/iter (+/- 51,928) = 698 MB/s
test decompress_brotli_fixed   ... bench:      38,993 ns/iter (+/- 8,919) = 17025 MB/s
test decompress_brotli_float   ... bench:     539,966 ns/iter (+/- 204,754) = 1229 MB/s
test decompress_brotli_int32   ... bench:     367,070 ns/iter (+/- 73,648) = 1808 MB/s
test decompress_brotli_int64   ... bench:     146,930 ns/iter (+/- 35,438) = 4518 MB/s
test decompress_brotli_int96   ... bench:       5,649 ns/iter (+/- 679) = 117518 MB/s
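
To make the before/after comparison easier to read, the compression timings can be reduced to speedup factors (time before divided by time after). The sketch below copies the ns/iter figures from the two result sets above; the helper names are made up for illustration.

```rust
/// Speedup factor: how many times faster the `after_ns` timing is than `before_ns`.
fn speedup(before_ns: u64, after_ns: u64) -> f64 {
    before_ns as f64 / after_ns as f64
}

/// (benchmark suffix, ns/iter before, ns/iter after) for the compress_brotli_* benchmarks.
const COMPRESS_RESULTS: [(&str, u64, u64); 8] = [
    ("binary", 61_637_026, 3_923_306),
    ("boolean", 1_563_393, 25_157),
    ("double", 16_397_775, 462_516),
    ("fixed", 326_538, 13_354),
    ("float", 14_645_035, 261_208),
    ("int32", 20_020_357, 275_277),
    ("int64", 20_739_429, 168_823),
    ("int96", 44_012, 4_535),
];

/// Build one summary line per benchmark, e.g. "compress_brotli_<name>: <ratio>x".
fn speedup_report() -> Vec<String> {
    COMPRESS_RESULTS
        .iter()
        .map(|(name, before, after)| {
            format!("compress_brotli_{}: {:.1}x", name, speedup(*before, *after))
        })
        .collect()
}
```

Printing each line of `speedup_report()` gives one ratio per benchmark. Given the large +/- ranges in the raw output, the ratios are best read as rough indications rather than precise measurements.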

Without change 3 (changes 1 and 2 only):

test compress_brotli_binary    ... bench:   2,777,882 ns/iter (+/- 1,247,970) = 131 MB/s
test compress_brotli_boolean   ... bench:      29,590 ns/iter (+/- 9,017) = 43 MB/s
test compress_brotli_double    ... bench:     977,756 ns/iter (+/- 479,775) = 81 MB/s
test compress_brotli_fixed     ... bench:      26,173 ns/iter (+/- 1,832) = 511 MB/s
test compress_brotli_float     ... bench:     454,359 ns/iter (+/- 80,253) = 88 MB/s
test compress_brotli_int32     ... bench:     450,869 ns/iter (+/- 105,584) = 90 MB/s
test compress_brotli_int64     ... bench:     896,055 ns/iter (+/- 55,858) = 90 MB/s
test compress_brotli_int96     ... bench:       4,384 ns/iter (+/- 745) = 3 MB/s
test decompress_brotli_binary  ... bench:   4,187,645 ns/iter (+/- 623,909) = 158 MB/s
test decompress_brotli_boolean ... bench:       7,095 ns/iter (+/- 2,370) = 93567 MB/s
test decompress_brotli_double  ... bench:   1,094,682 ns/iter (+/- 138,317) = 606 MB/s
test decompress_brotli_fixed   ... bench:      56,858 ns/iter (+/- 11,433) = 11675 MB/s
test decompress_brotli_float   ... bench:     579,580 ns/iter (+/- 61,161) = 1145 MB/s
test decompress_brotli_int32   ... bench:      64,417 ns/iter (+/- 8,459) = 10305 MB/s
test decompress_brotli_int64   ... bench:     186,933 ns/iter (+/- 17,688) = 3551 MB/s
test decompress_brotli_int96   ... bench:       5,648 ns/iter (+/- 997) = 117538 MB/s

Fixes #105.

coveralls commented 6 years ago

Coverage decreased (-0.1%) to 94.844% when pulling b21761c361a4ceaef0a65c6fec07e1eeb2092de2 on fix-brotli into 5143db1835fd04cea659bf6fe0c814eb35efc2a8 on master.

sunchao commented 6 years ago

Merged. Thanks @sadikovi.

I'm not sure why the code coverage dropped, since this change doesn't add any new code.

sadikovi commented 6 years ago

Great that you fixed the issue!

All good, no worries. Coverage reports can be a bit weird sometimes. I suggest we check the actual tests and use coverage only as a reference.