Open sergeyshaykhullin opened 11 months ago
For a query that only does group by product_external_id, we only need to build a hash map with 200 distinct values. But for group by a, order by b limit, we need to collect all input from upstream.
It's not a bug. Closed.
@stdpain But there is a predicate product_external_id = 123
That means the hash map has only 1 element
And the second case should only have to order a single row
@sergeyshaykhullin can you provide the query profiles for them?
@stdpain Queries and schema are in issue, 3 and 4
@sergeyshaykhullin can you run with set enable_parallel_merge=false;
Similar to https://github.com/StarRocks/starrocks/pull/35899
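A minimal sketch of how that suggestion could be tried from a SQL client, assuming the variable is applied at session scope. Only the variable name comes from the comment above; the queries to rerun are the ones from the issue and are not reproduced here.

```sql
-- Turn off parallel merge for the current session only
-- (variable name taken from the comment above).
SET enable_parallel_merge = false;

-- Confirm the session value before rerunning the two queries from the issue.
SHOW VARIABLES LIKE 'enable_parallel_merge';
```

After the change, rerun both queries in the same session and compare their profiles against the earlier runs.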
With order by, the query profile contains extra data shuffling, but where + group by retain just a single row.
Below is just an example with the required columns and a single day of aggregation; in the production environment there are roughly 40 columns and wide date ranges.
Query time goes from 400ms to 1800+ms with the same result
Steps to reproduce the behavior (Required)
1.
Populate with data
400ms
1800+ms
Expected behavior (Required)
Ordering should happen at no cost on a single row and without extra shuffling
Real behavior (Required)
StarRocks version (Required)
3.2.2-269e832
Non default variables