prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

Fix wrong results bug with count over mixed aggregation #23013

Closed rschlussel closed 2 weeks ago

rschlussel commented 2 weeks ago

Description

Fix a wrong results bug with count over an aggregation that had a mix of global and non-global grouping sets.

We were not checking for single global aggregations correctly in QueryCardinalityUtil: we reported the plan as scalar if there were any empty grouping sets, rather than only when the empty grouping set was the sole grouping set. This is now fixed. If all of the grouping sets are global, the cardinality is exactly the number of grouping sets; otherwise it is at least the number of global grouping sets.
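The corrected rule can be sketched as follows. This is a minimal illustration, not the code in QueryCardinalityUtil: the class name, method names, and the representation of grouping sets as lists of column names are all hypothetical, and it assumes the aggregation has at least one grouping set.

```java
import java.util.List;

public class GroupingSetCardinality {
    // Hypothetical helper, not Presto's actual API.
    // Returns {lowerBound, upperBound}; an upperBound of -1 means "unbounded".
    // An empty inner list stands for a global grouping set, i.e. ().
    static long[] outputRowBounds(List<List<String>> groupingSets) {
        long globalSets = groupingSets.stream().filter(List::isEmpty).count();
        if (globalSets == groupingSets.size()) {
            // Every grouping set is global: each one emits exactly one row.
            return new long[] {globalSets, globalSets};
        }
        // Mixed case: each global set still emits one row,
        // while the grouped sets emit zero or more rows.
        return new long[] {globalSets, -1};
    }

    // A plan is scalar only when it produces exactly one row.
    static boolean isScalar(List<List<String>> groupingSets) {
        long[] bounds = outputRowBounds(groupingSets);
        return bounds[0] == 1 && bounds[1] == 1;
    }
}
```

Under this sketch, a lone `()` grouping set is scalar, while `GROUPING SETS (nationkey, ())` has a lower bound of one row and no upper bound, so it is no longer treated as scalar.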

This fixes queries like the following:

SELECT COUNT(*) FROM (SELECT count(*) FROM tpch.sf1.nation GROUP BY GROUPING SETS (nationkey, ()));

Previously we would incorrectly return 1. Now we return 26: one row for each of the 25 nationkey groups plus one row for the global grouping set.

This change may also fix bugs with correlated subqueries, as those also use the isScalar() utility function.

Motivation and Context

Fixes https://github.com/prestodb/presto/issues/22977

Impact

Fixes wrong results for queries with counts over mixed global and grouped aggregations, e.g. SELECT COUNT(*) FROM (SELECT count(*) FROM tpch.sf1.nation GROUP BY GROUPING SETS (nationkey, ()));

Test Plan

Added unit tests.

Contributor checklist

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Fix a bug where count queries on top of aggregations with mixed global and grouped grouping sets could return wrong results.