When a filter removes all rows before a STATS, most of our aggregations return null:
row x = 0 | where x > 0 | eval y = 1 | stats avg(y)
avg(y)
---------------
null
Aggregations over constants are inconsistent with this, they always evaluate to some value:
row x = 0 | where x > 0 | stats avg(1)
avg(1)
---------------
1.0
While this is inconsistent, I'm not sure if the consistent behavior for aggregations over constants is actually desirable; this would require evaluating the entire first part of the query upstream from the STATS just to see if there's at least one row, which is wasteful.
In case of AVG, the correct surrogate expression for AVG([1,2,3]) would probably be CASE(COUNT(*) > 0, 1, null) * MV_AVG([1,2,3]) rather than just MV_AVG([1,2,3]).
When a filter removes all rows before a
STATS
, most of our aggregations returnnull
:Aggregations over constants are inconsistent with this, they always evaluate to some value:
While this is inconsistent, I'm not sure if the consistent behavior for aggregations over constants is actually desirable; this would require evaluating the entire first part of the query upstream from the
STATS
just to see if there's at least one row, which is wasteful.In case of AVG, the correct surrogate expression for
AVG([1,2,3])
would probably beCASE(COUNT(*) > 0, 1, null) * MV_AVG([1,2,3])
rather than justMV_AVG([1,2,3])
.