Open alamb opened 2 weeks ago
Looks like #12168 might also be a starting point for solving this problem. I also thought about solution 2 while working on that PR, and IMO solution 2 is the better option. I will take this one.
File an upstream ticket in arrow-rs to support StringView in the regexp_like kernels, and leave a link to that ticket in the DataFusion code
take
I also thought about solution 2 while working on that PR, and IMO solution 2 is the better option.
I agree solution 2 is better; it will just take longer, as it needs two coordinated PRs. One thing we have done in the past is to do the initial implementation in DataFusion, and then file a ticket and port the code upstream to arrow-rs. Once the code is released in arrow-rs and available in DataFusion, we remove the copy in DataFusion.
Is your feature request related to a problem or challenge?
Part of https://github.com/apache/datafusion/issues/11752
As we work to complete StringView support in DataFusion, @2010YOUY01 noticed in https://github.com/apache/datafusion/issues/11752#issuecomment-2308176932 that we don't currently support the regexp-like binary operators (https://datafusion.apache.org/user-guide/sql/operators.html#op-re-match) for StringView.
Reproducer
Describe the solution you'd like
StringView should be supported for these operators (i.e., the query should run without error).
Describe alternatives you've considered
Here are the relevant operator names:
Here is the dispatch code:
https://github.com/apache/datafusion/blob/0f96af5b500efff72314f840a59a736787cc3def/datafusion/physical-expr/src/expressions/binary.rs#L621-L632
It appears that the corresponding arrow-rs kernel does not yet support StringView: https://docs.rs/arrow-string/52.2.0/src/arrow_string/regexp.rs.html#307-311
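To make the gap concrete, here is a minimal, std-only sketch of the kind of type dispatch described above. It is not the DataFusion or arrow-rs code: `StringType`, `toy_match`, `dispatch_before`, and `dispatch_after` are hypothetical names, the "kernel" is a plain substring search standing in for a real regex engine, and it assumes the existing dispatch only has arms for the two offset-based string types.

```rust
// Hypothetical, simplified model of a type-dispatched string kernel.
// None of these names are real arrow-rs or DataFusion APIs.

#[derive(Debug, Clone, Copy, PartialEq)]
enum StringType {
    Utf8,
    LargeUtf8,
    Utf8View,
}

/// Stand-in for a regex kernel: substring search instead of a real regex.
fn toy_match(values: &[&str], pattern: &str) -> Vec<bool> {
    values.iter().map(|v| v.contains(pattern)).collect()
}

/// Assumed current behavior: only Utf8 and LargeUtf8 have a dispatch arm,
/// so any other string type falls through to an error.
fn dispatch_before(
    ty: StringType,
    values: &[&str],
    pattern: &str,
) -> Result<Vec<bool>, String> {
    match ty {
        StringType::Utf8 | StringType::LargeUtf8 => Ok(toy_match(values, pattern)),
        other => Err(format!("regex match is not supported for {:?}", other)),
    }
}

/// Desired behavior: the view type gets its own arm and reaches the kernel.
fn dispatch_after(
    ty: StringType,
    values: &[&str],
    pattern: &str,
) -> Result<Vec<bool>, String> {
    match ty {
        StringType::Utf8 | StringType::LargeUtf8 | StringType::Utf8View => {
            Ok(toy_match(values, pattern))
        }
    }
}

fn main() {
    let values = ["foo", "bar"];
    // Without a Utf8View arm the call ends in an error.
    println!("{:?}", dispatch_before(StringType::Utf8View, &values, "fo"));
    // With the arm added, the same input is evaluated.
    println!("{:?}", dispatch_after(StringType::Utf8View, &values, "fo"));
}
```

In the real code the new arm would also need a kernel that accepts StringView input, which is the part that is missing upstream.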
So what I would suggest is:
Additional context
No response