Closed kgpai closed 8 months ago
Adding repro and further details soon.
@kgpai Is this due to multimap_agg being order-sensitive?
presto:di> select multimap_agg(x, y) from unnest(array[1, 1, 1], array[1, 10, 3]) as t(x, y);
_col0
----------------
{1=[1, 10, 3]}
presto:di> select multimap_agg(x, y order by y desc) from unnest(array[1, 1, 1], array[1, 10, 3]) as t(x, y);
_col0
----------------
{1=[10, 3, 1]}
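To make the order sensitivity concrete outside of Presto, here is a minimal Python sketch that models the aggregate as "collect values per key in arrival order". The `multimap_agg` function below is a toy stand-in written for illustration. It is not Presto's or Velox's implementation, and the input rows simply mirror the query above.

```python
from collections import defaultdict


def multimap_agg(rows):
    """Toy model: collect values per key in the order the rows arrive."""
    result = defaultdict(list)
    for key, value in rows:
        result[key].append(value)
    return dict(result)


# Same data as the unnest(...) example: x is always 1, y is 1, 10, 3.
rows = [(1, 1), (1, 10), (1, 3)]

# No explicit ordering: the value list follows input order.
unordered = multimap_agg(rows)

# Equivalent of "order by y desc": sort the rows before aggregating.
ordered_desc = multimap_agg(sorted(rows, key=lambda r: r[1], reverse=True))

print(unordered)     # {1: [1, 10, 3]}
print(ordered_desc)  # {1: [10, 3, 1]}
```

The two results hold the same values but compare as different maps, which is the kind of mismatch a result comparison would flag if the two engines consume rows in different orders.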
FYI, there are currently some problems reproducing this:
I will create a PR for 2 and 3, and I will also update 1 with support for AggregationRunnerTest.
Finally, I was only able to see this happen on one run of AggregationFuzzer against Presto. I am going to see if I can reproduce the problem by running it a few more times for some hours.
It seems this bug occurs because PrestoQueryRunner currently doesn't support sorting keys (ORDER BY) in aggregates. See here: https://github.com/facebookincubator/velox/blob/main/velox/exec/fuzzer/PrestoQueryRunner.cpp#L214 . For example, in this case we produce queries like: 'SELECT g0, g1, g2, multimap_agg(c0, c1) as a0 FROM tmp GROUP BY g0, g1, g2'. The ordering is dropped from the generated query, so the results of order-sensitive aggregations might not match.
cc: @duanmeng
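As a rough illustration of the gap described above, the sketch below renders an aggregate call with and without an ORDER BY clause. It is written in Python for brevity and is not the PrestoQueryRunner code or the actual fix. The helper names `aggregate_call` and `group_by_query` and the `(column, ascending)` shape of the sorting keys are assumptions made up for this example.

```python
def aggregate_call(function, args, sorting_keys=()):
    """Render an aggregate call, optionally with ORDER BY inside the parentheses.

    sorting_keys is a hypothetical list of (column, ascending) pairs.
    """
    call = f"{function}({', '.join(args)}"
    if sorting_keys:
        order = ", ".join(
            f"{column} {'ASC' if ascending else 'DESC'}"
            for column, ascending in sorting_keys
        )
        call += f" ORDER BY {order}"
    return call + ")"


def group_by_query(group_keys, aggregate, alias, table="tmp"):
    """Render a single-aggregate GROUP BY query in the shape quoted above."""
    keys = ", ".join(group_keys)
    return f"SELECT {keys}, {aggregate} as {alias} FROM {table} GROUP BY {keys}"


keys = ["g0", "g1", "g2"]

# Without sorting keys: the query shape that loses the ordering.
plain = group_by_query(keys, aggregate_call("multimap_agg", ["c0", "c1"]), "a0")

# With sorting keys: the ordering is carried into the SQL text.
ordered = group_by_query(
    keys,
    aggregate_call("multimap_agg", ["c0", "c1"], [("c1", False)]),
    "a0",
)

print(plain)
print(ordered)
```

With the sorting keys carried through, both engines are asked to aggregate in the same order, so an order-sensitive result such as the value lists of multimap_agg can be compared directly.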
Fixed this with #8233. Closing.
Description
Using Presto as the source of truth, we see that the results returned by Velox and Presto differ.
Error Reproduction
See attached plan file. plan_nodes.zip
Relevant logs