Open andygrove opened 1 month ago
take
Hi, the benchmark we were running in Apache DataFusion Comet builds the following expression:
let expr = Arc::new(
    CaseExpr::try_new(
        None,
        vec![(predicate.clone(), make_col("c2", 1))],
        Some(make_col("c3", 2)),
    )
    .unwrap(),
);
Its eval method turns out to be NoExpression.
I compared the code for NoExpression, Expression, and the other eval methods, but didn't find much of a difference.
Can someone please guide me on how to approach the performance optimization for this scenario?
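For discussion, here is a minimal sketch of what an "expr or expr" fast path might look like: evaluate the predicate once, evaluate both branches over the whole batch, then pick per row. This is only an illustration using plain Rust vectors, not the DataFusion or Arrow API. The names `Column` and `if_then_else` are hypothetical, and whether this actually beats the current NoExpression path would need to be confirmed with the benchmark.

```rust
/// Toy stand-in for a nullable column (hypothetical; not an Arrow array).
type Column = Vec<Option<i64>>;

/// Row-wise selection between two already-evaluated branches:
/// take `then_vals[i]` where the predicate is true, otherwise `else_vals[i]`.
/// A NULL (None) predicate is treated as false, as in SQL CASE/IF.
fn if_then_else(
    predicate: &[Option<bool>],
    then_vals: &[Option<i64>],
    else_vals: &[Option<i64>],
) -> Column {
    assert_eq!(predicate.len(), then_vals.len());
    assert_eq!(predicate.len(), else_vals.len());
    predicate
        .iter()
        .zip(then_vals.iter().zip(else_vals.iter()))
        .map(|(p, (t, e))| if *p == Some(true) { *t } else { *e })
        .collect()
}

fn main() {
    // Mirrors CASE WHEN predicate THEN c2 ELSE c3 END on a four-row batch.
    let predicate = vec![Some(true), Some(false), None, Some(true)];
    let c2 = vec![Some(1), Some(2), Some(3), None];
    let c3 = vec![Some(10), None, Some(30), Some(40)];

    let out = if_then_else(&predicate, &c2, &c3);
    println!("{:?}", out);
}
```

The trade-off in this shape is that both branches are computed for every row, which is cheap when they are plain column references (as in the benchmark above) but could be wasteful, or even incorrect, if a branch is expensive or can fail on rows it should never see.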
Is your feature request related to a problem or challenge?
In DataFusion Comet, we have a custom IfExpr for IF(condition, true_expr, false_expr). In https://github.com/apache/datafusion-comet/pull/681 we removed its evaluate implementation and instead delegate to CaseExpr. This resulted in great performance improvements for the "column or null" and "scalar or scalar" cases thanks to recent optimizations in DataFusion, but resulted in a small regression for the "expr or expr" case.
Describe the solution you'd like
I would like to see if we can optimize for the "expr or expr" cases, learning from the original IfExpr implementation code.
Describe alternatives you've considered
No response
Additional context
No response