Open alex-spies opened 3 months ago
Pinging @elastic/es-analytical-engine (Team:Analytics)
`SubstituteSurrogates` does something ok now, but considering https://github.com/elastic/elasticsearch/issues/100634 this rule should be executed multiple times instead. `PropagateEvalFoldables` (when enabled for aggregates as well) should cover cases where the foldable expression is not inside the aggregate, for example `eval x = [5,6,7] | stats max(x)`. And, when `PropagateEvalFoldables` is executed, the `SubstituteSurrogates` rule is no longer executed.
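To make the `eval x = [5,6,7] | stats max(x)` case concrete, here is a rough Python toy model of what propagating a foldable `EVAL` field into a later `STATS` could look like. It is not the Elasticsearch implementation: the plan encoding, the helper names, and the output field name `m` are all assumptions made for illustration, and only plain literals are treated as foldable.

```python
# Toy model (assumption): a plan is a list of (command, {field: expression}).
# Expressions are str (field reference), tuple (function call), or a literal.

def is_literal(expr):
    # Simplification: only plain literals count as foldable here.
    return isinstance(expr, (int, float, list))

def inline(expr, known):
    """Replace references to known literal-valued fields with their values."""
    if isinstance(expr, str):
        return known.get(expr, expr)
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(inline(arg, known) for arg in expr[1:])
    return expr

def propagate_eval_foldables(plan):
    known = {}  # field name -> literal value defined by an earlier EVAL
    out = []
    for cmd, fields in plan:
        new_fields = {}
        for name, expr in fields.items():
            expr = inline(expr, known)
            new_fields[name] = expr
            if cmd == "EVAL" and is_literal(expr):
                known[name] = expr
            else:
                known.pop(name, None)  # field redefined as non-literal
        out.append((cmd, new_fields))
    return out

# eval x = [5,6,7] | stats m = max(x)   ("m" is a hypothetical name)
plan = [("EVAL", {"x": [5, 6, 7]}), ("STATS", {"m": ("max", "x")})]
propagated = propagate_eval_foldables(plan)
print(propagated)
```

After propagation the aggregate's argument is the literal itself, so a rule that only looks at foldable expressions inside the aggregate gets a chance to match.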
In the substitutions batch of our `LogicalPlanOptimizer`, there are 4 rules that take an expression like `| STATS foo = avg(x*x) + 2` and turn it into a simple aggregation with enclosing `EVAL`s; in this example, this becomes (essentially) an `EVAL` before the `STATS` plus `EVAL`s after it.

This is becoming complicated and more difficult to reason about because the substitutions happen across 4 rules; let's see if we can do with just 2 rules.
More specifically,
- `ReplaceStatsNestedExpressionWithEval` turns `STATS foo = avg(x*x) + 2` into `EVAL $$x = x*x | STATS foo = avg($$x) + 2`.
- `ReplaceStatsAggExpressionWithEval` then turns `| STATS foo = avg($$x) + 2` into `| STATS $$foo = avg($$x) | EVAL foo = $$foo + 2`.
- `SubstituteSurrogates` replaces `| STATS $$foo = avg($$x)` by `| STATS $$foo_sum = sum($$x), $$foo_count = count($$x) | EVAL $$foo = $$foo_sum/$$foo_count`.
- `ReplaceStatsNestedExpressionWithEval` runs again to account for stuff that happened in `TranslateMetricsAggregate`.
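The first three steps above can be traced with a small Python toy model. This is only a sketch under stated assumptions, not how the optimizer is implemented: expressions are nested tuples, the function names are made up for illustration, the `$$` naming scheme is hypothetical, and the second run after `TranslateMetricsAggregate` is not modeled.

```python
AGGS = {"avg", "sum", "count"}

def is_agg(e):
    return isinstance(e, tuple) and e[0] in AGGS

def first_leaf(e):
    return first_leaf(e[1]) if isinstance(e, tuple) else e

def show(e):
    # No parenthesization: enough for this example, not a general printer.
    if not isinstance(e, tuple):
        return str(e)
    if is_agg(e):
        return f"{e[0]}({show(e[1])})"
    return f"{show(e[1])} {e[0]} {show(e[2])}"

def render(plan):
    return " | ".join(
        cmd + " " + ", ".join(f"{n} = {show(e)}" for n, e in fields.items())
        for cmd, fields in plan)

def pull_nested_args(stats):
    """Step 1: move non-trivial aggregate arguments into an EVAL before STATS."""
    evals = {}
    def visit(e):
        if not isinstance(e, tuple):
            return e
        if is_agg(e):
            arg = e[1]
            if isinstance(arg, tuple):
                name = f"$${first_leaf(arg)}"  # hypothetical naming
                evals[name] = arg
                arg = name
            return (e[0], arg)
        return (e[0], visit(e[1]), visit(e[2]))
    new_stats = {n: visit(e) for n, e in stats.items()}
    return ([("EVAL", evals)] if evals else []) + [("STATS", new_stats)]

def pull_aggs_out(stats):
    """Step 2: keep bare aggregates in STATS, compute the rest in an EVAL after.
    Assumes at most one aggregate per field."""
    new_stats, evals = {}, {}
    def visit(x, field):
        if is_agg(x):
            new_stats[f"$${field}"] = x
            return f"$${field}"
        if isinstance(x, tuple):
            return (x[0], visit(x[1], field), visit(x[2], field))
        return x
    for name, e in stats.items():
        if is_agg(e):
            new_stats[name] = e
        else:
            evals[name] = visit(e, name)
    return [("STATS", new_stats)] + ([("EVAL", evals)] if evals else [])

def substitute_surrogates(stats):
    """Step 3: replace avg by sum and count, plus an EVAL dividing the two."""
    new_stats, evals = {}, {}
    for name, e in stats.items():
        if is_agg(e) and e[0] == "avg":
            new_stats[f"{name}_sum"] = ("sum", e[1])
            new_stats[f"{name}_count"] = ("count", e[1])
            evals[name] = ("/", f"{name}_sum", f"{name}_count")
        else:
            new_stats[name] = e
    return [("STATS", new_stats)] + ([("EVAL", evals)] if evals else [])

def apply(rule, plan):
    out = []
    for cmd, fields in plan:
        out += rule(fields) if cmd == "STATS" else [(cmd, fields)]
    return out

def optimize(plan):
    stages = [render(plan)]
    for rule in (pull_nested_args, pull_aggs_out, substitute_surrogates):
        plan = apply(rule, plan)
        stages.append(render(plan))
    return stages

stages = optimize([("STATS", {"foo": ("+", ("avg", ("*", "x", "x")), 2)})])
for stage in stages:
    print(stage)
```

Note that in this model two different rules end up emitting an `EVAL` after the `STATS`, which is the overlap this issue is about.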
It makes sense that there's 1 rule that creates `EVAL`s after the aggregation (`ReplaceStatsAggExpressionWithEval`) and one that pulls nested expressions out of agg functions into an `EVAL` before the aggregation (`ReplaceStatsNestedExpressionWithEval`).
- `SubstituteSurrogates` should only substitute and let `ReplaceStatsAggExpressionWithEval` handle creating the `EVAL` after the `STATS`.
- `ReplaceStatsNestedExpressionWithEval` after `TranslateMetricsAggregate`
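One way to picture the proposed split, again as a Python toy model rather than actual optimizer code: the surrogate step only rewrites `avg(a)` to `sum(a) / count(a)` in place, and a single later step is responsible for every `EVAL` after the `STATS`. The tuple encoding, the function names, and the `$$foo_sum` style naming are assumptions for illustration.

```python
AGGS = {"avg", "sum", "count"}

def is_agg(e):
    return isinstance(e, tuple) and e[0] in AGGS

def show(e):
    # No parenthesization: enough for this example, not a general printer.
    if not isinstance(e, tuple):
        return str(e)
    if is_agg(e):
        return f"{e[0]}({show(e[1])})"
    return f"{show(e[1])} {e[0]} {show(e[2])}"

def substitute_only(e):
    """Surrogate step: rewrite avg(a) to sum(a) / count(a) in place, no EVAL."""
    if not isinstance(e, tuple):
        return e
    if e[0] == "avg":
        return ("/", ("sum", e[1]), ("count", e[1]))
    if is_agg(e):
        return e
    return (e[0], substitute_only(e[1]), substitute_only(e[2]))

def pull_aggs_out(stats):
    """Single owner of the EVAL after STATS: bare aggregates stay in STATS,
    everything around them moves into the trailing EVAL."""
    new_stats, evals = {}, {}
    def visit(x, field):
        if is_agg(x):
            tmp = f"$${field}_{x[0]}"  # hypothetical naming scheme
            new_stats[tmp] = x
            return tmp
        if isinstance(x, tuple):
            return (x[0], visit(x[1], field), visit(x[2], field))
        return x
    for name, e in stats.items():
        if is_agg(e):
            new_stats[name] = e
        else:
            evals[name] = visit(e, name)
    return new_stats, evals

# Starting point: nested arguments were already pulled into an EVAL before STATS.
stats = {"foo": ("+", ("avg", "$$x"), 2)}
substituted = {n: substitute_only(e) for n, e in stats.items()}
new_stats, evals = pull_aggs_out(substituted)

print("STATS " + ", ".join(f"{n} = {show(e)}" for n, e in substituted.items()))
print("STATS " + ", ".join(f"{n} = {show(e)}" for n, e in new_stats.items()))
print("EVAL " + ", ".join(f"{n} = {show(e)}" for n, e in evals.items()))
```

In this sketch the intermediate `$$foo` field disappears and only one trailing `EVAL` is produced, since the division and the `+ 2` are extracted by the same step.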