Open sebastiw opened 6 years ago
@sebastiw thank you for your report. Were you able to reproduce this using VALUES in place of tables?
No I am not able to reproduce it with VALUES.
Could you reproduce it using TPCH schemas (TPCH connector)?
No, not reproducable with either TPCH, nor MySQL connector.
Problem is with HIVE it seems.
Could you provide table descriptions and partitioning?
Thank you for guiding me in the right direction.
I have some tables with two partition keys, e.g. date and customer id. The query will fail when trying to grab one of those fields if there is only one distinct value for that field. i.e. Selecting customer id will fail and selecting date will succeed in the following table. If we have another distinct customer id it will succeed as well it seems.
| Customer ID | Date |
| 322 | 2018-01-01 |
| 322 | 2018-01-02 |
| 322 | 2018-01-03 |
@sopel39 incidentally i was able to reproduce this in tests using tpch connector (https://github.com/prestodb/presto/pull/11596) but not with a true Presto server (also tried tpch connector). I don't understand this.
@sopel39 incidentally i was able to reproduce this in tests using tpch connector (#11596) but not with a true Presto server (also tried tpch connector). I don't understand this.
Local mode doesn't add exchanges. It's possible the rule that splits the aggregation into partial/final is kicking in.
Local mode doesn't add exchanges. It's possible the rule that splits the aggregation into partial/final is kicking in.
is it a bad thing? This improves functionality coverage, keeping local runner closer to real life.
Maybe the validator (the one that throws the "Final aggregation with default value not separated from ..."
exception) should be aware of local execution mode? or disabled in local runner?
You need a full server to run exchanges (i.e. DistributedQueryRunner). What’s concerning is that it happened with a real query - this could be due to the rule that splits aggs kicking in in a context where the aggregation runs in single-node mode.
I can also reproduce this with filters, instead of partition keys.
WITH query1 AS (SELECT a FROM table WHERE a=322 GROUP BY a) SELECT a FROM query1 GROUP BY GROUPING SETS ((a), ());
I.e. for tpch connector :
WITH a AS (SELECT nationkey FROM tpch.sf1.customer WHERE nationkey=21 GROUP BY nationkey) SELECT nationkey FROM a GROUP BY GROUPING SETS ((nationkey), ());
Any updates on this? Any more information I can provide?
We haven't had have time to pick it up yet, sorry. This is on my list, but if anybody else have time feel free to pick it up.
Any idea when it's planned to be fixed?
+1 on this, still seeing this too!
I get the error message "Final aggregation with default value not separated from partial aggregation by remote hash exchange" from https://github.com/prestodb/presto/blob/5dda212a416a9f04dff951e7ed45075998780749/presto-main/src/main/java/com/facebook/presto/sql/planner/sanity/ValidateAggregationsWithDefaultValues.java#L119
With the following minimized query example:
WITH query1 AS (SELECT a FROM table), query2 AS (SELECT a FROM query1 GROUP BY a) SELECT a FROM query2 GROUP BY GROUPING SETS ((a), ());
My actual query is larger, adding some calculations in query2 that depend on fields in query1, and then with filters in main select.
It seems that this should work. I'm running 0.206-146-gd3c9272f3d of Presto.