apache / iceberg-rust

Apache Iceberg
https://rust.iceberg.apache.org/
Apache License 2.0

feat: Convert predicate to arrow filter and push down to parquet reader #295

Closed viirya closed 1 month ago

viirya commented 3 months ago

This PR implements row filtering when reading Parquet files in an Iceberg scan. It does so by converting predicates into a Parquet Arrow filter, which the Parquet reader uses to filter rows during reading.

This implements the AlwaysTrue, AlwaysFalse, And, Or, Not, and Binary predicates, plus some of the Unary predicates. The remaining predicates (some Unary predicates and the Set predicates) are not implemented yet because arrow has no existing kernels for them. I'll implement them in follow-up work.
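
To make the idea concrete, here is a minimal, dependency-free sketch of compiling a predicate tree into a function that produces a boolean mask per batch, which is then used to keep or drop rows. It is not the code in this PR and does not use the arrow or parquet crates: the `Predicate` enum, `MaskFn`, `compile`, and `filter_rows` are all hypothetical names, the column is assumed to be a single nullable `i64` column, and nulls are assumed to follow three-valued logic with a null mask entry dropping the row.

```rust
/// Hypothetical, simplified predicate over one nullable i64 column.
/// This is an illustration only, not the predicate type used by iceberg-rust.
#[derive(Debug, Clone)]
enum Predicate {
    AlwaysTrue,
    AlwaysFalse,
    IsNull,        // unary
    LessThan(i64), // binary: column < literal
    Eq(i64),       // binary: column == literal
    And(Box<Predicate>, Box<Predicate>),
    Or(Box<Predicate>, Box<Predicate>),
    Not(Box<Predicate>),
}

/// A compiled filter: maps a batch of column values to a nullable boolean mask.
type MaskFn = Box<dyn Fn(&[Option<i64>]) -> Vec<Option<bool>>>;

/// Three-valued AND: false wins over null, null wins over true.
fn kleene_and(a: Option<bool>, b: Option<bool>) -> Option<bool> {
    match (a, b) {
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None,
    }
}

/// Three-valued OR: true wins over null, null wins over false.
fn kleene_or(a: Option<bool>, b: Option<bool>) -> Option<bool> {
    match (a, b) {
        (Some(true), _) | (_, Some(true)) => Some(true),
        (Some(false), Some(false)) => Some(false),
        _ => None,
    }
}

/// Walk the predicate tree once and build a closure that evaluates it per batch.
fn compile(pred: &Predicate) -> MaskFn {
    match pred {
        Predicate::AlwaysTrue => Box::new(|col: &[Option<i64>]| vec![Some(true); col.len()]),
        Predicate::AlwaysFalse => Box::new(|col: &[Option<i64>]| vec![Some(false); col.len()]),
        Predicate::IsNull => {
            Box::new(|col: &[Option<i64>]| col.iter().map(|v| Some(v.is_none())).collect())
        }
        Predicate::LessThan(lit) => {
            let lit = *lit;
            // A comparison against a null value yields a null mask entry.
            Box::new(move |col: &[Option<i64>]| col.iter().map(|v| v.map(|x| x < lit)).collect())
        }
        Predicate::Eq(lit) => {
            let lit = *lit;
            Box::new(move |col: &[Option<i64>]| col.iter().map(|v| v.map(|x| x == lit)).collect())
        }
        Predicate::And(l, r) => {
            let (l, r) = (compile(l), compile(r));
            Box::new(move |col: &[Option<i64>]| {
                l(col).into_iter().zip(r(col)).map(|(a, b)| kleene_and(a, b)).collect()
            })
        }
        Predicate::Or(l, r) => {
            let (l, r) = (compile(l), compile(r));
            Box::new(move |col: &[Option<i64>]| {
                l(col).into_iter().zip(r(col)).map(|(a, b)| kleene_or(a, b)).collect()
            })
        }
        Predicate::Not(p) => {
            let p = compile(p);
            Box::new(move |col: &[Option<i64>]| p(col).into_iter().map(|m| m.map(|b| !b)).collect())
        }
    }
}

/// Keep only the rows whose mask entry is exactly `Some(true)`.
fn filter_rows(pred: &Predicate, col: &[Option<i64>]) -> Vec<Option<i64>> {
    let mask = compile(pred)(col);
    col.iter()
        .zip(mask)
        .filter(|(_, m)| *m == Some(true))
        .map(|(v, _)| *v)
        .collect()
}
```

In this sketch, calling `filter_rows` with `And(Not(LessThan(3)), Not(Eq(7)))` on the column `[1, null, 7, 3]` keeps only the row holding `3`. Compiling the tree once and reusing the resulting closure for every batch mirrors, in spirit, handing a prepared filter to the reader instead of re-walking the predicate per row.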

close #265

liurenjie1024 commented 3 months ago

cc @viirya Is this ready for review, or do you still need to make more updates?

viirya commented 3 months ago

@liurenjie1024 It is ready for review. I will fix the conflicts.

viirya commented 3 months ago

I've addressed some of the review comments above. I will resolve the others soon. Thanks.

viirya commented 2 months ago

@liurenjie1024 I've addressed all comments. Thank you.

viirya commented 2 months ago

@liurenjie1024 Thanks for the review, and sorry for the delay. I addressed the comments by rewriting the visitors using the new API. I also replied with some further questions.

viirya commented 1 month ago

@liurenjie1024 I've addressed your comments. Please take a look when you can. Thanks.

liurenjie1024 commented 1 month ago

Oh, sorry, it seems we need to resolve conflicts. Everything else LGTM, thanks!

viirya commented 1 month ago

Thanks @liurenjie1024. I just resolved the conflicts.

liurenjie1024 commented 1 month ago

Thanks @viirya for this great effort!

viirya commented 1 month ago

Thanks @liurenjie1024 for the review!