Closed alamb closed 1 week ago
I am glad to pick up this ticket.
This issue must wait until #10920 because there is currently no convenient way to create a StringViewArray in Datafusion. If I am mistaken, please correct me.
> This issue must wait until #10920 because there is currently no convenient way to create a StringViewArray in Datafusion. If I am mistaken, please correct me.
I think you are right -- conveniently @XiangpengHao has one here https://github.com/apache/datafusion/pull/10925
Hi @Weijun-H , great to know you are working on this! I believe implementing this feature will eventually require https://github.com/apache/arrow-rs/issues/5897 to be solved, so I'm working on that issue so you won't be blocked
BTW I made a branch to work on StringView in DataFusion: https://github.com/apache/datafusion/issues/10961
StringView comparison added in https://github.com/apache/datafusion/pull/10985
Is your feature request related to a problem or challenge?
Part of https://github.com/apache/datafusion/issues/10918, [StringViewArray](https://docs.rs/arrow/latest/arrow/array/type.StringViewArray.html) support in DataFusion.
Several queries in the ClickBench suite have predicates on string columns such as "MobilePhoneModel" and "SearchPhrase" (in this case checking for the empty string).
Describe the solution you'd like
In order to improve performance of these queries we will need the ability to actually compare StringViewArrays to constant strings (and likely to each other).
Thus I would like to be able to run
StringViewColumn = scalar
StringViewColumn = StringViewColumn
(and likewise for BinaryView)
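To make the `StringViewColumn = scalar` case concrete, here is a minimal, standard-library-only sketch of why such a comparison can be cheap. It models the Arrow view layout in simplified form (strings of at most 12 bytes stored inline, longer ones stored as a 4-byte prefix plus a location in a data buffer). The names `View`, `ToyStringViewColumn`, `from_strs`, and `eq_scalar` are hypothetical and only for illustration; this is not the arrow-rs implementation, and a real array may reference several data buffers and carry a null bitmap.

```rust
/// Simplified model of one Arrow "view": the length plus either the whole
/// string inline (<= 12 bytes) or a 4-byte prefix and an offset into a
/// separate data buffer. Hypothetical type, not the arrow-rs representation.
#[derive(Debug, Clone, Copy)]
enum View {
    Inline { len: u32, data: [u8; 12] },
    Ref { len: u32, prefix: [u8; 4], offset: usize },
}

/// Toy column: one view per row plus a single shared data buffer.
struct ToyStringViewColumn {
    views: Vec<View>,
    buffer: Vec<u8>,
}

impl ToyStringViewColumn {
    /// Build the toy column, inlining short strings and spilling long ones.
    fn from_strs(values: &[&str]) -> Self {
        let mut views = Vec::with_capacity(values.len());
        let mut buffer = Vec::new();
        for v in values {
            let bytes = v.as_bytes();
            let len = bytes.len() as u32;
            if bytes.len() <= 12 {
                let mut data = [0u8; 12];
                data[..bytes.len()].copy_from_slice(bytes);
                views.push(View::Inline { len, data });
            } else {
                let mut prefix = [0u8; 4];
                prefix.copy_from_slice(&bytes[..4]);
                views.push(View::Ref { len, prefix, offset: buffer.len() });
                buffer.extend_from_slice(bytes);
            }
        }
        Self { views, buffer }
    }

    /// Evaluate `column = scalar`, producing one boolean per row.
    fn eq_scalar(&self, scalar: &str) -> Vec<bool> {
        let s = scalar.as_bytes();
        self.views
            .iter()
            .map(|view| match *view {
                // Short strings are compared without touching the buffer.
                View::Inline { len, data } => {
                    len as usize == s.len() && &data[..s.len()] == s
                }
                // Length and prefix checks reject most mismatches before
                // the full bytes in the data buffer are read.
                View::Ref { len, prefix, offset } => {
                    let n = len as usize;
                    n == s.len()
                        && prefix[..] == s[..4]
                        && &self.buffer[offset..offset + n] == s
                }
            })
            .collect()
    }
}
```

In this model an empty-string predicate like the ClickBench ones only needs to look at the length field of each view, which is part of the motivation for comparing views directly instead of casting the column back to a plain string array first.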
I basically want to run queries using these comparisons where table foo has StringView columns.
Describe alternatives you've considered
I suspect we will need to update the coercion logic and maybe also the arrow equality kernels like https://docs.rs/arrow/latest/arrow/compute/kernels/cmp/fn.eq.html
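As a rough illustration of what "updating the coercion logic" could mean, the sketch below picks a common type for the two sides of a comparison. Everything here is an assumption: `StrType` and `comparison_coercion` are hypothetical names (the variants only mirror Arrow's `DataType` names), and the rule of preferring the view type is one possible choice, not something this issue or DataFusion's code confirms.

```rust
/// Hypothetical subset of string types; the variant names mirror Arrow's
/// `DataType`, but this enum exists only for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StrType {
    Utf8,
    LargeUtf8,
    Utf8View,
}

/// Choose the type both operands are cast to before `=` is evaluated.
/// Assumed rule: if either side is a view, compare as views, so that a
/// StringView column is never cast away from its view representation.
fn comparison_coercion(lhs: StrType, rhs: StrType) -> StrType {
    use StrType::*;
    match (lhs, rhs) {
        (Utf8View, _) | (_, Utf8View) => Utf8View,
        (LargeUtf8, _) | (_, LargeUtf8) => LargeUtf8,
        (Utf8, Utf8) => Utf8,
    }
}
```

Under this assumed rule, `StringViewColumn = 'literal'` would cast the single literal to a view rather than converting the whole column, which keeps the cost of coercion independent of the number of rows.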
Additional context
No response