sunchao / parquet-rs

Apache Parquet implementation in Rust
Apache License 2.0
149 stars 20 forks source link

Use [u32; 3] for INT96 and fix TODOs #92

Closed sadikovi closed 6 years ago

sadikovi commented 6 years ago

This PR updates INT96 type to use [u32; 3] instead of Vec<u32>. I also updated plain decoding for INT96 to take advantage of an array and slightly refactored the method itself.

Now INT96 looks like this:

#[derive(Clone, Debug)]
pub struct Int96 {
  value: Option<[u32; 3]>
}

I also updated set_data method to set elements directly.

All tests pass locally. I also ran benchmark assuming that my code is correct (not included in this PR, but git diff is attached):

Benchmark diff:


diff --git a/benches/common.rs b/benches/common.rs
index c8d8e0c..1e68434 100644
--- a/benches/common.rs
+++ b/benches/common.rs
@@ -39,10 +39,31 @@ macro_rules! gen_random_ints {
}
}

+macro_rules! gen_random_int96 {

+gen_random_int96!(gen_int96_1000, 1000); + pub fn gen_test_strs(total: usize) -> (usize, Vec) { let mut words = Vec::new(); words.push("aaaaaaaaaa"); diff --git a/benches/decoding.rs b/benches/decoding.rs index ec1f097..300ccdd 100644 --- a/benches/decoding.rs +++ b/benches/decoding.rs @@ -113,6 +113,8 @@ fn bench_decoding( }) }

+plain!(plain_i96_1k_32, 1024, 128, Int96Type, Type::INT96, gen_int96_1000); + plain!(plain_i32_1k_32, 1024, 32, Int32Type, Type::INT32, gen_1000); plain!(plain_i32_1k_64, 1024, 64, Int32Type, Type::INT32, gen_1000); plain!(plain_i32_1k_128, 1024, 128, Int32Type, Type::INT32, gen_1000);

coveralls commented 6 years ago

Coverage Status

Coverage increased (+0.09%) to 94.887% when pulling b9fd453cf62758906d79f6c9acc6fa27a08ad38b on sadikovi:fix-int96-todo into 15f122deea469b31923736bbba9378a4c4ed3f34 on sunchao:master.

sunchao commented 6 years ago

Thanks @sadikovi ! The performance with this PR looks awesome. Do you know why it improved so much?

sadikovi commented 6 years ago

No worries. I think it is because of the change in decoding.rs code.

We used to create a copy of buffer ptr that covers a range of 12 bytes for each value, and create a slice and then vector.

Now we create a copy of buffer ptr once with all bytes that we are going to read, and then extract slice of the data. Since we know that all bytes should match to the values we need to read, we could safely iterate with two indices - one over values and another one over byte offsets. For each iteration we just reconstruct 3 u32 from slices.

sunchao commented 6 years ago

OK got it. It a little surprising that the change in decoding can give such a big boost on the performance. This is nice.

sunchao commented 6 years ago

Merged. Thanks @sadikovi !