apache / datafusion-comet

Apache DataFusion Comet Spark Accelerator
https://datafusion.apache.org/comet
Apache License 2.0
823 stars 163 forks source link

chore: Don't transform the HashAggregate to CometHashAggregate if Comet shuffle is disabled #991

Closed viirya closed 1 month ago

viirya commented 1 month ago

Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

How are these changes tested?

mbutrovich commented 1 month ago

Do the benchmarks in https://github.com/apache/datafusion-comet/blob/84cccf7dba6be52dc7901c7e7fe47c51a9a5b35a/spark/src/test/scala/org/apache/spark/sql/benchmark/CometAggregateBenchmark.scala#L4 need to be updated? Part of my exploration today (that motivates this PR) is that the CometBaseBenchmark and by extension the CometAggregateBenchmark does not set Comet at the shuffle manager. CometExecBenchmark does set the shuffle manager: https://github.com/apache/datafusion-comet/blob/84cccf7dba6be52dc7901c7e7fe47c51a9a5b35a/spark/src/test/scala/org/apache/spark/sql/benchmark/CometExecBenchmark.scala#L44

viirya commented 1 month ago

Yea, we should update CometAggregateBenchmarks to enable shuffle for it.

mbutrovich commented 1 month ago

This should help a lot for #987! Thanks for the quick fix!

viirya commented 1 month ago

Thanks @kazuyukitanimura @mbutrovich