Open alamb opened 6 months ago
I can do this one since I have been doing a related limit pushdown feature.
Wait until https://github.com/apache/arrow-datafusion/pull/9815#issue-2209595815 is merged.
This one is probably ready to work on now
While re-reading this, I think we should start by implementing limits that can be evaluated to a constant by the time the physical plan is created (i.e. without changing the physical execution plans).
It is not clear to me what `LIMIT 100 + x` means.
The key use cases are:
`LIMIT $1` -- parameters like https://github.com/apache/datafusion/issues/12294
`LIMIT 8 * 1024` -- expressions that evaluate to constants
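To make the distinction between these cases concrete, here is a minimal sketch of evaluating a limit expression to a constant at planning time. Everything in it (`LimitExpr`, `eval_limit`, the `params` slice) is a hypothetical stand-in for illustration and not DataFusion's actual `Expr` or planner API. Bound parameters and constant arithmetic resolve to a `u64`, while a column reference such as `x` in `LIMIT 100 + x` is rejected because it has no value until execution.

```rust
/// Toy expression type for a LIMIT clause.
/// Hypothetical: this is not DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum LimitExpr {
    Literal(i64),
    /// Positional parameter such as `$1` (1-based).
    Param(usize),
    /// Column reference such as `x`; never a constant at planning time.
    Column(String),
    Add(Box<LimitExpr>, Box<LimitExpr>),
    Sub(Box<LimitExpr>, Box<LimitExpr>),
    Mul(Box<LimitExpr>, Box<LimitExpr>),
}

/// Evaluate the expression to a signed integer, using the bound parameter values.
fn eval_i64(expr: &LimitExpr, params: &[i64]) -> Result<i64, String> {
    let overflow = || "integer overflow in LIMIT expression".to_string();
    match expr {
        LimitExpr::Literal(v) => Ok(*v),
        LimitExpr::Param(i) => i
            .checked_sub(1)
            .and_then(|idx| params.get(idx))
            .copied()
            .ok_or_else(|| format!("no value bound for parameter ${}", i)),
        LimitExpr::Column(name) => Err(format!("column `{}` is not a constant", name)),
        LimitExpr::Add(l, r) => eval_i64(l, params)?
            .checked_add(eval_i64(r, params)?)
            .ok_or_else(overflow),
        LimitExpr::Sub(l, r) => eval_i64(l, params)?
            .checked_sub(eval_i64(r, params)?)
            .ok_or_else(overflow),
        LimitExpr::Mul(l, r) => eval_i64(l, params)?
            .checked_mul(eval_i64(r, params)?)
            .ok_or_else(overflow),
    }
}

/// Resolve a LIMIT expression to the non-negative row count a physical plan needs.
fn eval_limit(expr: &LimitExpr, params: &[i64]) -> Result<u64, String> {
    let v = eval_i64(expr, params)?;
    if v < 0 {
        Err(format!("LIMIT must be non-negative, got {}", v))
    } else {
        Ok(v as u64)
    }
}
```

In this sketch, `LIMIT $1` with a bound value and `LIMIT 8 * 1024` both resolve to a plain row count, so the physical plan would not need to change; `LIMIT 100 + x` fails with an error at planning time.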
I will switch my focus to working on it.
Follow on to https://github.com/apache/arrow-datafusion/issues/9506
The idea is to support arbitrary expressions that can be consolidated to a constant in the LIMIT clause. For example
This query should be able to run (and return the single value)
https://github.com/apache/arrow-datafusion/pull/9790 adds support for basic `+`/`-`, but a general-purpose solution that handles any expr that can be consolidated to a constant would be better.

As suggested by @jonahgao, this might look like:

Change the `Limit` logical plan to support arbitrary expressions?

The `SimplifyExpressions` rule can automatically optimize them into constants. Some optimization rules such as `PushDownLimit` only run when the limit expression is a constant. We may need to add a cast for the limit expression when planning, only checking if it is a constant of type u64.

When creating the `LimitExec` physical plan, convert the limit expression into `PhysicalExpr` and evaluate it.

_Originally posted by @jonahgao in https://github.com/apache/arrow-datafusion/pull/9790#discussion_r1539358701_
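The flow suggested above could be sketched roughly as follows. This is a toy model under stated assumptions, not DataFusion code: the `Expr`, `Limit`, and `LimitExec` types here are simplified hypothetical stand-ins that only share names with the real ones, `simplify` plays the role of a constant-folding rule, and `constant_fetch` is the kind of check a rule like limit pushdown or the physical planner would perform. Unlike the suggestion, the sketch reads the folded literal directly instead of building and evaluating a physical expression.

```rust
/// Hypothetical, simplified expression type (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    Column(String),
    Add(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
}

/// Hypothetical logical limit node whose fetch is an arbitrary expression.
#[derive(Debug, Clone, PartialEq)]
struct Limit {
    fetch: Expr,
}

/// Hypothetical physical node: it only ever sees a resolved constant.
#[derive(Debug, Clone, PartialEq)]
struct LimitExec {
    fetch: u64,
}

/// Fold a binary node if both simplified children are literals; otherwise rebuild it.
fn fold(
    l: Expr,
    r: Expr,
    op: fn(i64, i64) -> Option<i64>,
    rebuild: fn(Box<Expr>, Box<Expr>) -> Expr,
) -> Expr {
    match (simplify(l), simplify(r)) {
        (Expr::Literal(a), Expr::Literal(b)) => match op(a, b) {
            Some(v) => Expr::Literal(v),
            // Leave overflowing arithmetic unfolded rather than wrapping.
            None => rebuild(Box::new(Expr::Literal(a)), Box::new(Expr::Literal(b))),
        },
        (l, r) => rebuild(Box::new(l), Box::new(r)),
    }
}

/// Bottom-up constant folding, in the spirit of an expression simplification rule.
fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => fold(*l, *r, i64::checked_add, Expr::Add),
        Expr::Sub(l, r) => fold(*l, *r, i64::checked_sub, Expr::Sub),
        other => other,
    }
}

/// Rules that need a constant limit would only fire when this returns `Some`.
fn constant_fetch(limit: &Limit) -> Option<u64> {
    match &limit.fetch {
        Expr::Literal(v) if *v >= 0 => Some(*v as u64),
        _ => None,
    }
}

/// Create the physical node, failing if the limit is not a non-negative constant.
fn plan_limit(limit: Limit) -> Result<LimitExec, String> {
    let simplified = Limit { fetch: simplify(limit.fetch) };
    constant_fetch(&simplified)
        .map(|fetch| LimitExec { fetch })
        .ok_or_else(|| {
            format!(
                "LIMIT expression {:?} is not a non-negative constant",
                simplified.fetch
            )
        })
}
```

One consequence of this shape is that rules depending on a constant limit can keep working unchanged as long as they run after simplification, and anything that cannot be folded surfaces as a planning error rather than a runtime one.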