apache / iceberg-rust

Apache Iceberg
https://rust.iceberg.apache.org/
Apache License 2.0
477 stars 97 forks source link

Modify `Bind` calls so that they don't consume `self` and instead return a new struct, leaving the original unmoved #290

Closed sdd closed 3 months ago

sdd commented 3 months ago

This is a pre-requisite to https://github.com/apache/iceberg-rust/pull/241 and was a part of that PR but has been pulled into it's own PR after discussions with @liurenjie1024.

The existing Predicate Bind implementation consumes self when bind() is called. This was done for performance reasons but has the consequence that it is not possible to create a BoundPredicate whilst keeping the original Predicate unconsumed, which we need to do in the table scan file planner when a filter predicate is specified.

The consequences of this change are that the literals within the expressions in the predicate are cloned in order to create a BoundPredicate. Most of the constituent parts of a Predicate were cloned or newly-created anyway even in the original implementation. The alternative would be that Clone would need to be implemented for Predicate and the whole Predicate would need to be cloned before calling bind, which would be more expensive. Alternatively we could have two bind methods, one consuming self and the other not doing so, which would be a bit ugly.

liurenjie1024 commented 3 months ago

Let's wait for a moment to see if others have comments.

cc @Xuanwo @ZENOTME @Fokko PTAL