oap-project / gazelle_plugin

Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.
Apache License 2.0

wrong results for hashagg #1104

Open zhouyuan opened 2 years ago

zhouyuan commented 2 years ago

Describe the bug: This is a rare case in hash aggregation (hashagg) over multiple columns. Because the key columns are combined into a single string-like "row", distinct rows whose combined keys come out identical are counted as one group.

To Reproduce

group by col_a, col_b

col_a col_b
"ab" "ab"
"a" "bab"
"aba" "b"
"abab" ""
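
The collision can be illustrated outside of Spark with a minimal Python sketch. It assumes the combined key is built by plain concatenation of the column values, with no separator or length information; that is an assumption made for illustration, not the plugin's actual code, and `naive_key` is a hypothetical helper.

```python
from collections import Counter

# The four rows from the reproduction case: (col_a, col_b)
rows = [("ab", "ab"), ("a", "bab"), ("aba", "b"), ("abab", "")]

def naive_key(row):
    # Hypothetical key builder: joins the key columns with no delimiter,
    # so the boundary between col_a and col_b is lost.
    return "".join(row)

# Grouping on the concatenated key merges all four rows.
naive_groups = Counter(naive_key(r) for r in rows)

# Grouping on the column tuple keeps the rows apart.
expected_groups = Counter(rows)

print(len(naive_groups), len(expected_groups))
```

Every row concatenates to the same string, "abab", so the naive key yields one group of four rows where a correct GROUP BY would yield four groups of one row each.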

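Expected behavior: Each of the four rows above should form its own group, since all four (col_a, col_b) pairs are distinct. The hash conflict in this case should be fixed.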
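
One common way to make a combined key unambiguous is to prefix each column value with its byte length. The Python sketch below shows that general technique only; it is not taken from the plugin, `length_prefixed_key` is a hypothetical name, and NULL handling is left out.

```python
import struct
from collections import Counter

def length_prefixed_key(row):
    # Hypothetical key builder: each column is written as a 4-byte
    # little-endian length followed by its UTF-8 bytes, so column
    # boundaries survive in the combined key.
    parts = []
    for value in row:
        data = value.encode("utf-8")
        parts.append(struct.pack("<I", len(data)))
        parts.append(data)
    return b"".join(parts)

rows = [("ab", "ab"), ("a", "bab"), ("aba", "b"), ("abab", "")]
groups = Counter(length_prefixed_key(r) for r in rows)

print(len(groups))
```

With the lengths included, the four rows produce four different keys. A plain separator character would be simpler, but it can still collide when the separator itself appears inside a column value.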

Additional context: N/A