Open alamb opened 2 weeks ago
cc @Lordworms
Sorry for the late review; I was busy this week. Initially, I was just trying to follow the same pattern as other ScalarUDFs, which use arrow-rs methods to implement their functionality, so I chose `arrow::reglike`. I can fix it to use `str.contains`.
> I was just trying to keep the same format as other ScalarUDF which utilize arrow-rs methods to implement functionality so I just chose `arrow::reglike`.
Makes sense. I think in this case, however, the function shouldn't actually have regexp support, so it would be better to use `str.contains`.
> I can fix it to use `str.contains`
Thank you!
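To make the distinction above concrete, here is a minimal sketch of literal substring matching over a nullable column, using only the Rust standard library. The function name `contains_literal` and the `Option<&str>` column representation are assumptions for illustration, not DataFusion or arrow-rs APIs.

```rust
/// Hypothetical helper (not part of DataFusion or arrow-rs):
/// literal substring test over a nullable column.
/// A null input (`None`) yields a null output, mirroring SQL semantics.
fn contains_literal(haystacks: &[Option<&str>], needle: &str) -> Vec<Option<bool>> {
    haystacks
        .iter()
        // `str::contains` with a `&str` argument matches the needle literally;
        // characters such as `.` have no special meaning.
        .map(|h| h.map(|s| s.contains(needle)))
        .collect()
}
```

With a needle of `"."`, this matches `"a.c"` but not `"abc"`, whereas a regexp-based implementation would treat `.` as "any character" and match both.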
Thanks @Lordworms. I'm thinking about @alamb's idea https://github.com/apache/datafusion/pull/10879#discussion_r1636789004, which only implements `contains` as a placeholder for translating/planning. It would eventually become something else, like `LIKE`, after the optimization phase. A similar case is `Expr::Wildcard`: it reflects `*` from SQL but doesn't have a corresponding physical expr.
One benefit of using `LIKE` is that it already has a highly optimized arrow implementation (as in, it will actually use substring matching if the pattern looks like `substr%`, etc.).
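As a rough sketch of what rewriting `contains` to `LIKE` might involve: the needle has to be wrapped in `%` and any `LIKE` metacharacters inside it escaped, otherwise the rewrite would change the meaning. The helper name `contains_to_like_pattern` is hypothetical, and the use of backslash as the escape character is an assumption; the thread does not show how the actual refactor handles this.

```rust
/// Hypothetical helper (not a DataFusion API): build a LIKE pattern that
/// matches any string containing `needle` literally.
/// Assumption: backslash is the LIKE escape character.
fn contains_to_like_pattern(needle: &str) -> String {
    let mut pattern = String::with_capacity(needle.len() + 2);
    pattern.push('%'); // leading wildcard: anything may precede the needle
    for ch in needle.chars() {
        // Escape LIKE metacharacters and the escape character itself
        // so they are matched literally.
        if matches!(ch, '%' | '_' | '\\') {
            pattern.push('\\');
        }
        pattern.push(ch);
    }
    pattern.push('%'); // trailing wildcard: anything may follow the needle
    pattern
}
```

For example, a needle of `abc` becomes the pattern `%abc%`, while `50%_off` becomes `%50\%\_off%` so that the `%` and `_` in the needle are not treated as wildcards.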
I'll refactor it to use `LIKE`.
_Originally posted by @waynexia in https://github.com/apache/datafusion/pull/10879#discussion_r1635767599_