prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0
15.93k stars 5.33k forks source link

Query failure with "Final aggregation with default value not separated from partial aggregation by remote hash exchange" #11484

Open sebastiw opened 6 years ago

sebastiw commented 6 years ago

I get the error message "Final aggregation with default value not separated from partial aggregation by remote hash exchange" from https://github.com/prestodb/presto/blob/5dda212a416a9f04dff951e7ed45075998780749/presto-main/src/main/java/com/facebook/presto/sql/planner/sanity/ValidateAggregationsWithDefaultValues.java#L119

With the following minimized query example: WITH query1 AS (SELECT a FROM table), query2 AS (SELECT a FROM query1 GROUP BY a) SELECT a FROM query2 GROUP BY GROUPING SETS ((a), ());

My actual query is larger, adding some calculations in query2 that depend on fields in query1, and then with filters in main select.

It seems that this should work. I'm running 0.206-146-gd3c9272f3d of Presto.

findepi commented 6 years ago

@sebastiw thank you for your report. Were you able to reproduce this using VALUES in place of tables?

sebastiw commented 6 years ago

No I am not able to reproduce it with VALUES.

sopel39 commented 6 years ago

Could you reproduce it using TPCH schemas (TPCH connector)?

sebastiw commented 6 years ago

No, not reproducable with either TPCH, nor MySQL connector.

Problem is with HIVE it seems.

sopel39 commented 6 years ago

Could you provide table descriptions and partitioning?

sebastiw commented 6 years ago

Thank you for guiding me in the right direction.

I have some tables with two partition keys, e.g. date and customer id. The query will fail when trying to grab one of those fields if there is only one distinct value for that field. i.e. Selecting customer id will fail and selecting date will succeed in the following table. If we have another distinct customer id it will succeed as well it seems.

| Customer ID | Date       |
| 322         | 2018-01-01 |
| 322         | 2018-01-02 |
| 322         | 2018-01-03 |
findepi commented 5 years ago

@sopel39 incidentally i was able to reproduce this in tests using tpch connector (https://github.com/prestodb/presto/pull/11596) but not with a true Presto server (also tried tpch connector). I don't understand this.

martint commented 5 years ago

@sopel39 incidentally i was able to reproduce this in tests using tpch connector (#11596) but not with a true Presto server (also tried tpch connector). I don't understand this.

Local mode doesn't add exchanges. It's possible the rule that splits the aggregation into partial/final is kicking in.

findepi commented 5 years ago

Local mode doesn't add exchanges. It's possible the rule that splits the aggregation into partial/final is kicking in.

is it a bad thing? This improves functionality coverage, keeping local runner closer to real life. Maybe the validator (the one that throws the "Final aggregation with default value not separated from ..." exception) should be aware of local execution mode? or disabled in local runner?

martint commented 5 years ago

You need a full server to run exchanges (i.e. DistributedQueryRunner). What’s concerning is that it happened with a real query - this could be due to the rule that splits aggs kicking in in a context where the aggregation runs in single-node mode.

sebastiw commented 5 years ago

I can also reproduce this with filters, instead of partition keys.

WITH query1 AS (SELECT a FROM table WHERE a=322 GROUP BY a) SELECT a FROM query1 GROUP BY GROUPING SETS ((a), ());

I.e. for tpch connector : WITH a AS (SELECT nationkey FROM tpch.sf1.customer WHERE nationkey=21 GROUP BY nationkey) SELECT nationkey FROM a GROUP BY GROUPING SETS ((nationkey), ());

sebastiw commented 5 years ago

Any updates on this? Any more information I can provide?

sopel39 commented 5 years ago

We haven't had have time to pick it up yet, sorry. This is on my list, but if anybody else have time feel free to pick it up.

OmerJog commented 4 years ago

Any idea when it's planned to be fixed?

dwootton commented 3 years ago

+1 on this, still seeing this too!