Closed alamb closed 1 month ago
See https://github.com/datafusion-contrib/datafusion-functions-json/pull/26 - support for custom SQL operators in datafusion-functions-json
using #11208.
Hello @alamb, just checking in on the remaining tasks. Is there anything specific we're waiting on before we create issues? If we're all set, I would be happy to jump in and pick up a few tasks.
Hi @dharanad I don't think there is anything from my perspective. Thank you for offering
In fact it seems as if @xinlifoobar has already started with https://github.com/apache/datafusion/pull/11215 ❤️
I've created issues for a couple of tasks. Please let me know if you think anything needs updating in the descriptions. I'm new here and learning from shadowing the experienced folks
thank you @dharanad -- this is very helpful 🙏
FWIW in general @dharanad I have had the best luck with writing a description on tickets that requires as little context as possible (aka distill down what is needed into the description, rather than assuming the new contributor will read the epic and get all the backstory)
The rationale for this duplication is to lower the barrier to new contributors
Thanks for the feedback! I really appreciate it. You're right, making the ticket description concise and self-contained will definitely help reduce the barrier for new contributors. I'll update the description to include the necessary context. Thank you
Created issues for the remaining tasks and tried adding a description based on my understanding of each issue. Also updated the descriptions of the older ones.
sql_compound_identifier_to_expr: https://github.com/apache/datafusion/issues/11244
sql_substring_to_expr: https://github.com/apache/datafusion/issues/11245
sql_position_to_expr: https://github.com/apache/datafusion/issues/11246
Given how much UserDefinedSQLPlanner is being used for existing stuff within datafusion, perhaps it should be called just SQLPlanner or CustomSQLPlanner?
I agree
Or maybe something like ExprPlanner 🤔 as it is being used to plan specific exprs.
ExprPlanner sounds good.
Given #11220 and #11243, those are very similar APIs with UDF plans. I am trying to draft an API, e.g.,

    // Plan the user defined function; returns the original expression
    // arguments if planning is not possible
    fn plan_udf(
        &self,
        _sql: &sqlparser::ast::Expr,
        args: Vec<Expr>,
    ) -> Result<PlannerResult<Vec<Expr>>> {
        Ok(PlannerResult::Original(args))
    }

to unify the usages.
I have created a draft PR #11263 to discuss this. The flaw here is that the parameter sql is partially borrowed and has to be cloned at the very beginning. Maybe we should consider using references if possible.
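To make the shape of the proposed API easier to discuss, here is a minimal, std-only sketch of how a default plan_udf that hands the arguments back could sit next to a planner that overrides it. Everything in it is an assumption for illustration: Expr, SqlExpr, UdfPlanner, NoopPlanner and SubstringPlanner are simplified stand-ins, the error type is a plain String, and none of this is the actual DataFusion or sqlparser API.

```rust
// Simplified stand-ins; these are NOT the real DataFusion or sqlparser types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    Call { name: String, args: Vec<Expr> },
}

/// Stand-in for `sqlparser::ast::Expr`: only a function name is modeled.
#[derive(Debug)]
struct SqlExpr {
    func_name: String,
}

#[derive(Debug, PartialEq)]
enum PlannerResult<T> {
    /// The planner produced an expression.
    Planned(Expr),
    /// The planner declined; the untouched arguments are handed back.
    Original(T),
}

/// Hypothetical trait carrying the drafted method.
trait UdfPlanner {
    /// Default behavior: decline and return the arguments unchanged.
    /// The SQL node is taken by reference, so no clone is needed here.
    fn plan_udf(
        &self,
        _sql: &SqlExpr,
        args: Vec<Expr>,
    ) -> Result<PlannerResult<Vec<Expr>>, String> {
        Ok(PlannerResult::Original(args))
    }
}

/// Uses the default implementation only.
struct NoopPlanner;
impl UdfPlanner for NoopPlanner {}

/// Hypothetical planner that only handles `substring`.
struct SubstringPlanner;
impl UdfPlanner for SubstringPlanner {
    fn plan_udf(
        &self,
        sql: &SqlExpr,
        args: Vec<Expr>,
    ) -> Result<PlannerResult<Vec<Expr>>, String> {
        if sql.func_name == "substring" {
            // The target name "substr" is illustrative only.
            Ok(PlannerResult::Planned(Expr::Call {
                name: "substr".to_string(),
                args,
            }))
        } else {
            Ok(PlannerResult::Original(args))
        }
    }
}

fn main() {
    let sql = SqlExpr { func_name: "substring".to_string() };
    let args = vec![Expr::Literal(1), Expr::Literal(2)];
    println!("{:?}", NoopPlanner.plan_udf(&sql, args.clone()));
    println!("{:?}", SubstringPlanner.plan_udf(&sql, args));
}
```

Passing args by value and returning them in Original means a planner that declines costs no copy, while the SQL node is only borrowed.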
Eventually, I made this #11263, please let me know your thoughts. Thanks :)
CC @jayzhan211 @dharanad @alamb
I think we are pretty close to calling this done.
I just double checked and sql_compound_identifier_to_expr is the only thing that needs this treatment to remove the call to get_function_meta:
That appears to be the last issue https://github.com/search?q=repo%3Aapache%2Fdatafusion+get_function_meta+path%3A%2F%5Edatafusion%5C%2Fsql%5C%2F%2F&type=code
I think we can claim we are done 🎉
thanks everyone
Is your feature request related to a problem or challenge?
As discussed in https://github.com/apache/datafusion/issues/10534, @jayzhan211 added a UserDefinedSQLPlanner in https://github.com/apache/datafusion/pull/11180 so that the translation of certain SQL syntax to LogicalPlans and Exprs is not hard coded in SqlToRel but is instead controlled by a UserDefinedSQLPlanner.
Now that we have the pattern, we need to move the remaining functionality that is hard coded in SqlToRel (e.g. looking up a function "date_part" by name) to the UserDefinedSQLPlanner.
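For readers new to the pattern, the following std-only sketch shows one way the dispatch could look: instead of looking up "date_part" by name itself, a SqlToRel-like struct asks each registered planner in turn and passes the arguments along whenever one declines. All names here (SqlPlanner, plan_extract, Declining, DatePartPlanner, and the simplified Expr and SqlToRel) are hypothetical stand-ins chosen for illustration, not DataFusion's real types or signatures.

```rust
// Simplified stand-ins; NOT the real DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Call { name: String, args: Vec<Expr> },
}

enum PlannerResult<T> {
    Planned(Expr),
    Original(T),
}

/// Hypothetical extension point replacing a hard-coded lookup.
trait SqlPlanner {
    /// Default: decline and hand the arguments back.
    fn plan_extract(&self, args: Vec<Expr>) -> PlannerResult<Vec<Expr>> {
        PlannerResult::Original(args)
    }
}

/// A planner that never handles anything (default method only).
struct Declining;
impl SqlPlanner for Declining {}

/// A planner that owns the knowledge of the "date_part" function.
struct DatePartPlanner;
impl SqlPlanner for DatePartPlanner {
    fn plan_extract(&self, args: Vec<Expr>) -> PlannerResult<Vec<Expr>> {
        PlannerResult::Planned(Expr::Call {
            name: "date_part".to_string(),
            args,
        })
    }
}

/// Stand-in for the SQL-to-plan translator: it knows no function names.
struct SqlToRel {
    planners: Vec<Box<dyn SqlPlanner>>,
}

impl SqlToRel {
    fn plan_extract(&self, mut args: Vec<Expr>) -> Result<Expr, String> {
        for planner in &self.planners {
            match planner.plan_extract(args) {
                PlannerResult::Planned(expr) => return Ok(expr),
                // Take the arguments back and offer them to the next planner.
                PlannerResult::Original(returned) => args = returned,
            }
        }
        Err("no registered planner handled the expression".to_string())
    }
}

fn main() {
    let sql_to_rel = SqlToRel {
        planners: vec![Box::new(Declining), Box::new(DatePartPlanner)],
    };
    let args = vec![Expr::Column("ts".to_string())];
    println!("{:?}", sql_to_rel.plan_extract(args));
}
```

With this arrangement the translator only needs to know the order of its planners, so adding or replacing support for a piece of SQL syntax means registering a different planner rather than editing the translator.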
Describe the solution you'd like
To rewrite with sql planner
Describe alternatives you've considered
No response
Additional context
Discussion is here: https://github.com/apache/datafusion/issues/10534