sunchao / parquet-rs

Apache Parquet implementation in Rust
Apache License 2.0

Improve performance of record reader by using Vec::with_capacity #206

Closed sadikovi closed 5 years ago

sadikovi commented 5 years ago

This PR is a minor update that we planned a few months ago. It replaces Vec::new with Vec::with_capacity wherever possible. The gain is fairly small (roughly 13% and 23% less time per iteration on the two collect benchmarks below), but it is nice to have.
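For readers unfamiliar with the pattern, here is a minimal sketch of the kind of change involved. It is not code from this PR: the function names and the u64 element type are hypothetical and chosen only for illustration.

```rust
/// Hypothetical helper: collects `n` values into a vector that grows on demand.
fn collect_growing(n: usize) -> Vec<u64> {
    // Vec::new does not allocate; the buffer is reallocated as pushes outgrow it.
    let mut values = Vec::new();
    for i in 0..n {
        values.push(i as u64);
    }
    values
}

/// Hypothetical helper: same loop, but space for `n` elements is reserved up front.
fn collect_preallocated(n: usize) -> Vec<u64> {
    // Vec::with_capacity guarantees room for at least `n` elements,
    // so the pushes below never trigger a reallocation.
    let mut values = Vec::with_capacity(n);
    for i in 0..n {
        values.push(i as u64);
    }
    values
}
```

Both functions return the same contents; only the number of intermediate reallocations differs, which is why the expected gain is modest and depends on how many elements are pushed.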

Before:

test record_reader_10k_collect             ... bench:  41,606,845 ns/iter (+/- 2,679,661) = 16 MB/s
test record_reader_stock_simulated_collect ... bench: 341,853,174 ns/iter (+/- 52,464,363) = 3 MB/s
test record_reader_stock_simulated_column  ... bench:  14,812,884 ns/iter (+/- 856,897) = 87 MB/s

After:

test record_reader_10k_collect             ... bench:  36,193,855 ns/iter (+/- 2,259,012) = 18 MB/s
test record_reader_stock_simulated_collect ... bench: 261,540,852 ns/iter (+/- 10,225,835) = 4 MB/s
test record_reader_stock_simulated_column  ... bench:  18,913,039 ns/iter (+/- 296,384) = 68 MB/s

I am not sure why record_reader_stock_simulated_column is slower (about 14.8 ms before versus 18.9 ms after per iteration); this change should not affect that benchmark at all.