Open gatesn opened 1 year ago
Calcite's RexProgram class is a bundle of project (and optionally filter) expressions. The expressions are topologically sorted so that common expressions are computed first. It might be a useful concept for substrait to borrow.
@julianhyde is RexProgram itself an expression?
Not exactly. It is a collection of expressions and a filter. If you squint you could regard it as an expression that returns an optional tuple.
But I do think it solves the requirement for common subexpressions admirably.
An ordinary "let" expression, without the ability to return tuples, is going to struggle with cases like this (in Standard ML-like pseudocode), with multiple outputs based on the same common subexpressions:
let
  val x = y + z
in
  { b = a > x andalso b <= x, p = a + x, q = p + z }
end
and this one, where output is conditional on an intermediate expression:
let
  val x = y + z
in
  if a > x andalso b <= x then
    emit { p = a + x, q = p + z }
end
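To make the "optional tuple" idea concrete, here is a rough Python sketch of a program as an ordered list of expressions, where each expression may read the input row and the results of earlier expressions, plus named projections and an optional condition. The `Program` class, its fields, and the sample column names are all hypothetical and only illustrate the concept; this is not Calcite's actual RexProgram API. For simplicity the sketch evaluates every expression before checking the condition.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Program:
    """Hypothetical stand-in for a RexProgram-like bundle (not Calcite's API)."""

    # Topologically sorted: expression i may only read slots[0..i-1].
    exprs: Sequence[Callable[[dict, list], object]]
    # Output field name -> index of the expression that produces it.
    projects: dict
    # Index of the boolean filter expression, or None for "no filter".
    condition: Optional[int] = None

    def run(self, row: dict) -> Optional[dict]:
        slots: list = []
        for expr in self.exprs:
            slots.append(expr(row, slots))  # each result is computed once
        if self.condition is not None and not slots[self.condition]:
            return None  # the row is filtered out: no tuple is emitted
        return {name: slots[i] for name, i in self.projects.items()}


# The conditional example above: x is shared by the filter and by p,
# and q is built on top of p.
program = Program(
    exprs=[
        lambda r, s: r["y"] + r["z"],                  # 0: x = y + z
        lambda r, s: r["a"] > s[0] and r["b"] <= s[0],  # 1: a > x andalso b <= x
        lambda r, s: r["a"] + s[0],                    # 2: p = a + x
        lambda r, s: s[2] + r["z"],                    # 3: q = p + z
    ],
    projects={"p": 2, "q": 3},
    condition=1,
)

emitted = program.run({"a": 5, "b": 3, "y": 1, "z": 2})
filtered = program.run({"a": 1, "b": 3, "y": 1, "z": 2})
```

With the first row the condition holds and a record with `p` and `q` is returned; with the second row the condition fails and `run` returns `None`, which plays the role of the missing tuple.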
@julianhyde If it isn't an expression then how do we use it? For example, a project is currently defined as:
message ProjectRel {
  RelCommon common = 1;
  Rel input = 2;
  repeated Expression expressions = 3;
  substrait.extensions.AdvancedExtension advanced_extension = 10;
}
Should we change field 3 to a repeated Program message?
I believe this has been brought up a couple of times before as part of #287 and somewhat in #320, but I thought it would be worth discussing explicitly.
The question is whether it should be possible to reference common subexpressions, either using an explicit "let" expression operator, e.g.
let x = y + z; and(gt(a, x), lte(b, x))
or by pulling them to the plan-level as is done with the relation reference operator.
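For illustration only, the explicit "let" operator could behave roughly like the small Python evaluator below. The node classes (`Ref`, `Call`, `Let`), the `OPS` table, and the `evaluate` function are hypothetical names made up for this sketch and do not correspond to any existing Substrait message.

```python
from dataclasses import dataclass


@dataclass
class Ref:
    name: str  # reference to an input field or a let-bound name


@dataclass
class Call:
    op: str  # key into OPS
    args: tuple


@dataclass
class Let:
    name: str  # name bound for the duration of body
    value: object  # expression computed once
    body: object  # expression that may refer to name


# Hypothetical function table covering only the operators in the example.
OPS = {
    "add": lambda a, b: a + b,
    "gt": lambda a, b: a > b,
    "lte": lambda a, b: a <= b,
    "and": lambda a, b: a and b,
}


def evaluate(node, env: dict):
    if isinstance(node, Ref):
        return env[node.name]
    if isinstance(node, Call):
        return OPS[node.op](*(evaluate(arg, env) for arg in node.args))
    if isinstance(node, Let):
        bound = evaluate(node.value, env)  # the common subexpression, computed once
        return evaluate(node.body, {**env, node.name: bound})
    raise TypeError(f"unknown node: {node!r}")


# let x = y + z; and(gt(a, x), lte(b, x))
expr = Let(
    "x",
    Call("add", (Ref("y"), Ref("z"))),
    Call("and", (Call("gt", (Ref("a"), Ref("x"))),
                 Call("lte", (Ref("b"), Ref("x"))))),
)
result = evaluate(expr, {"a": 5, "b": 3, "y": 1, "z": 2})
```

The binding is scoped to the body of the "let", so it yields a single value; sharing `x` across several outputs or relations would need something like the plan-level reference approach.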