Open crbl1122 opened 6 months ago
I tested different queries locally, and here are the results. I suspect the total, avg1, and avg2 columns are somehow related and together lead to an error in certain cases.
Successful:
SELECT (f_int2 + f_int2) as total, AVG(f_int) OVER (w ROWS 2 PRECEDING) as avg1 FROM PCOLLECTION WINDOW w AS (ORDER BY f_double asc)
SELECT AVG(f_int) OVER (w ROWS 2 PRECEDING) as avg1, AVG(f_int2) OVER (w ROWS 2 PRECEDING) as avg2 FROM PCOLLECTION WINDOW w AS (ORDER BY f_double asc)
SELECT (f_int2 + f_int2) as total, AVG(f_int) OVER (w ROWS 2 PRECEDING) as avg1, AVG(f_int) OVER (w ROWS 2 PRECEDING) as avg2 FROM PCOLLECTION WINDOW w AS (ORDER BY f_double asc)
With error:
SELECT (f_int + f_int) as total, AVG(f_int) OVER (w ROWS 2 PRECEDING) as avg1 FROM PCOLLECTION WINDOW w AS (ORDER BY f_double asc)
SELECT (f_int + f_int) as total, AVG(f_int2) OVER (w ROWS 2 PRECEDING) as avg1 FROM PCOLLECTION WINDOW w AS (ORDER BY f_double asc)
SELECT (f_int2 + f_int2) as total, AVG(f_int) OVER (w ROWS 2 PRECEDING) as avg1, AVG(f_int2) OVER (w ROWS 2 PRECEDING) as avg2 FROM PCOLLECTION WINDOW w AS (ORDER BY f_double asc)
What happened?
I use the SqlTransform component in an Apache Beam pipeline running on Dataflow. If I add one more variable to the SQL query, I get this error:
RuntimeError: org.apache.beam.sdk.extensions.sql.impl.SqlConversionException: Unable to convert query .... Caused by: org.apache.beam.vendor.calcite.v1_28_0.org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not enough rules to produce a node with desired properties: convention=BEAM_LOGICAL. All the inputs have relevant nodes, however the cost is still infinite.
This query works:
Whereas this query does not work:
The difference between these two queries is only one line:
AVG(NUM_2) OVER (w ROWS 2 PRECEDING) AS NUM_2_sliding_3M
Pipeline definition:
Why does SqlTransform not accept more than one rolling average computation?
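For reference, this is what I expect each rolling average to compute. It is only a plain-Python sketch of the semantics of AVG(x) OVER (w ROWS 2 PRECEDING) with w ordered by f_double, not Beam code. The helper name rolling_avg and the sample rows are made up for illustration, and the sketch always returns floats, which may differ from the result type Beam SQL produces for integer columns.

```python
from typing import Dict, List, Sequence


def rolling_avg(rows: Sequence[Dict[str, float]], value_col: str,
                order_col: str, preceding: int = 2) -> List[float]:
    """Average value_col over the current row and up to `preceding` earlier rows.

    Hypothetical helper: mimics AVG(value_col) OVER (ORDER BY order_col
    ROWS `preceding` PRECEDING). Results are returned in sorted order.
    """
    ordered = sorted(rows, key=lambda r: r[order_col])
    result = []
    for i in range(len(ordered)):
        # The frame is the current row plus at most `preceding` rows before it.
        frame = ordered[max(0, i - preceding): i + 1]
        result.append(sum(r[value_col] for r in frame) / len(frame))
    return result


# Made-up sample data using the column names from the test queries.
rows = [
    {"f_double": 1.0, "f_int": 10, "f_int2": 1},
    {"f_double": 2.0, "f_int": 20, "f_int2": 3},
    {"f_double": 3.0, "f_int": 30, "f_int2": 5},
    {"f_double": 4.0, "f_int": 40, "f_int2": 7},
]

# Two independent rolling averages over the same window definition.
avg1 = rolling_avg(rows, "f_int", "f_double")
avg2 = rolling_avg(rows, "f_int2", "f_double")
print(avg1)
print(avg2)
```

Nothing in these semantics ties the two averages together: each is computed independently over the same ordering, so I would expect a query with both to plan just like a query with either one alone.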
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components