Novemser opened this issue 6 years ago (status: Open)
```
scala> spark.sql("select count(tp_int),avg(tp_int) from full_data_type_table").explain
17/12/28 14:17:13 INFO SparkSqlParser: Parsing command: select count(tp_int),avg(tp_int) from full_data_type_table
== Physical Plan ==
*HashAggregate(keys=[], functions=[sum(count(tp_int#100L)#386L), sum(sum(tp_int#100L)#387L), sum(count(tp_int#100L)#386L)])
+- Exchange SinglePartition
   +- *HashAggregate(keys=[], functions=[partial_sum(count(tp_int#100L)#386L), partial_sum(sum(tp_int#100L)#387L), partial_sum(count(tp_int#100L)#386L)])
      +- TiDB CoprocessorRDD{[table: full_data_type_table], Ranges: Start:[-9223372036854775808], End: [9223372036854775807], Columns: [tp_int], Aggregates: Count([tp_int]), Sum([tp_int]), Count([tp_int])}
```
Actually, the pushed-down aggregates

```
Aggregates: Count([tp_int]), Sum([tp_int]), Count([tp_int])
```

could be optimized to

```
Aggregates: Count([tp_int]), Sum([tp_int])
```

The duplicate Count shows up because avg(tp_int) is rewritten as Sum/Count for pushdown, while the explicit count(tp_int) contributes its own Count, and the two are never merged.
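A minimal sketch of what the deduplication could look like before the coprocessor request is built. The `AggFunc`, `Count`, and `Sum` types here are hypothetical stand-ins for TiSpark's internal aggregate descriptors, not the actual client classes; the point is just that duplicates can be collapsed as long as each original position is remembered so results can be mapped back to Spark's expected output order.

```scala
// Hypothetical stand-ins for TiSpark's internal aggregate descriptors.
sealed trait AggFunc
case class Count(column: String) extends AggFunc
case class Sum(column: String) extends AggFunc

// Collapse duplicate aggregates before building the coprocessor request,
// returning an index map so each original slot can find its result.
def dedupAggregates(aggs: Seq[AggFunc]): (Seq[AggFunc], Seq[Int]) = {
  val unique = aggs.distinct                      // drops the second Count
  val indexMap = aggs.map(a => unique.indexOf(a)) // where each original lives
  (unique, indexMap)
}

// The plan in this issue pushes down Count, Sum, Count:
val pushed = Seq(Count("tp_int"), Sum("tp_int"), Count("tp_int"))
val (uniqueAggs, mapping) = dedupAggregates(pushed)
// uniqueAggs == Seq(Count(tp_int), Sum(tp_int))
// mapping    == Seq(0, 1, 0): count(tp_int) and avg's Count share one slot
```

With the index map in hand, the upstream HashAggregate can keep reading three columns while the coprocessor only computes two.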
How does this happen? Don't we already have a PR that optimizes this?
This still seems to be a problem. Leaving it open for now.