Open allisonport-db opened 10 months ago
@allisonport-db @vkorukanti I would like to work on this.
@krishnanravi Sounds good. I assigned the issue to you. Thank you!
Here is an example PR that adds the IS_NULL expression. STARTS_WITH also requires similar changes. Feel free to ping us with any questions.
@vkorukanti Quick question: which data types do we want to support for starts with in kernel defaults?
I just finished an implementation that requires both the left and right sides to be strings and raises an unsupported operation exception otherwise. Is this enough for the kernel defaults?
In contrast, looking at the support for starts with in Spark, an expression of any data type is supported on both sides. Do we want an implementation as comprehensive as Spark's?
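For reference, the string-only behavior described above could look roughly like the sketch below. This is not the actual kernel-defaults code: the class and method names are hypothetical, operands are modeled as plain boxed Java values rather than Kernel column vectors, and returning null for a null operand is an assumption about the desired SQL-style semantics.

```java
// Hypothetical sketch, not the kernel-defaults implementation.
final class StartsWithEvalSketch {

    /**
     * Evaluates STARTS_WITH(left, right) for a single row.
     * Only string operands are accepted; anything else is rejected.
     */
    static Boolean startsWith(Object left, Object right) {
        // Reject non-string operands up front.
        if (left != null && !(left instanceof String)) {
            throw new UnsupportedOperationException(
                "STARTS_WITH expects a string on the left side, got "
                    + left.getClass().getSimpleName());
        }
        if (right != null && !(right instanceof String)) {
            throw new UnsupportedOperationException(
                "STARTS_WITH expects a string on the right side, got "
                    + right.getClass().getSimpleName());
        }
        // Assumption: a null operand yields a null result.
        if (left == null || right == null) {
            return null;
        }
        return ((String) left).startsWith((String) right);
    }

    public static void main(String[] args) {
        System.out.println(startsWith("delta", "de"));
        System.out.println(startsWith("delta", "x"));
        System.out.println(startsWith(null, "de"));
    }
}
```

Supporting arbitrary data types as Spark does would mean adding implicit casts before this check, which is a larger change than the string-only version.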
The LIKE expression is available. We can write STARTS_WITH as LIKE 'str%' and add the remaining data skipping part of the work.
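One detail to watch with this rewrite: if the prefix itself contains the LIKE wildcards % or _, they have to be escaped, otherwise they match more than the literal prefix. A minimal sketch of the pattern construction follows. The helper name is hypothetical, and it assumes the LIKE implementation accepts a single-character escape; how the escape character is passed to Kernel's LIKE is not shown here.

```java
// Hypothetical helper, not part of the Kernel API.
final class StartsWithAsLikeSketch {

    /**
     * Builds a LIKE pattern that matches exactly the strings starting with
     * the given literal prefix, escaping wildcard characters in the prefix.
     */
    static String toLikePattern(String prefix, char escape) {
        StringBuilder pattern = new StringBuilder(prefix.length() + 1);
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            // '%' and '_' are LIKE wildcards; the escape char must be escaped too.
            if (c == '%' || c == '_' || c == escape) {
                pattern.append(escape);
            }
            pattern.append(c);
        }
        // A trailing '%' matches any suffix, including the empty one.
        return pattern.append('%').toString();
    }

    public static void main(String[] args) {
        System.out.println(toLikePattern("str", '\\'));
        System.out.println(toLikePattern("50%_off", '\\'));
    }
}
```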
Feature request
Which Delta project/connector is this regarding?
Overview
Currently Kernel supports a limited set of expressions. We should 1) add the STARTS_WITH expression and 2) use file statistics to prune files based on the expression.
Motivation
Better file pruning.
Further details
This means we should 1) add STARTS_WITH to the Kernel Predicate and support it in the kernel-defaults project, and 2) generate a data skipping filter according to the same rules we use in delta-spark.
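To illustrate the second part, here is a rough sketch of a min/max-based skipping check for STARTS_WITH. The rule shown (truncate the file's min and max to the prefix length and check that the prefix falls between them) is an assumption about what the delta-spark rules look like and should be verified against the delta-spark source. The names are hypothetical, and Java's String.compareTo compares UTF-16 code units, which may not match the string ordering used when the statistics were written.

```java
// Hypothetical sketch; the rule is an assumption, not copied from delta-spark.
final class StartsWithSkippingSketch {

    /** Returns at most the first len characters of s. */
    static String truncate(String s, int len) {
        return s.length() <= len ? s : s.substring(0, len);
    }

    /**
     * Returns false only when the file's min/max statistics prove that no
     * value in the column can start with the prefix, so the file can be skipped.
     */
    static boolean mightContainMatch(String min, String max, String prefix) {
        // Missing statistics: the file cannot be skipped safely.
        if (min == null || max == null) {
            return true;
        }
        int n = prefix.length();
        // Any matching value v satisfies min <= v <= max, and truncation to
        // n characters preserves that ordering, so the prefix must lie
        // between the truncated min and the truncated max.
        return truncate(min, n).compareTo(prefix) <= 0
            && truncate(max, n).compareTo(prefix) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(mightContainMatch("apple", "banana", "ap"));
        System.out.println(mightContainMatch("apple", "banana", "ca"));
    }
}
```

The check is conservative: it never skips a file that could contain a match, but it can keep files that turn out to contain none.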
Willingness to contribute
The Delta Lake Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?