apache / datafusion-comet

Apache DataFusion Comet Spark Accelerator
https://datafusion.apache.org/comet
Apache License 2.0

Support SubqueryBroadcastExec in Comet #242

Open viirya opened 2 months ago

viirya commented 2 months ago

What is the problem the feature request solves?

Currently we support BroadcastExchange if it is under BroadcastHashJoin in Comet.

Besides broadcast joins, BroadcastExchange can also occur under SubqueryBroadcastExec, which is used for dynamic pruning expressions. Spark's BroadcastExchange assumes it is row-based, so we cannot simply transform a BroadcastExchange under SubqueryBroadcastExec into Comet's broadcast operator.

We need to come up with a Comet SubqueryBroadcastExec that supports Comet's BroadcastExchange.
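
To make the two cases concrete, here is a rough sketch in plain Python. It is a toy model only: `PlanNode`, `broadcast_parents`, and `convertible` are hypothetical names made up for illustration and are not Spark or Comet APIs. In real Spark the broadcast subquery is referenced from a dynamic pruning expression rather than being a direct child of the scan; it is drawn as a child here just to keep the tree simple.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Toy stand-in for a physical plan node (hypothetical, not Spark's SparkPlan)."""
    name: str
    children: List["PlanNode"] = field(default_factory=list)


def broadcast_parents(plan: PlanNode) -> List[str]:
    """Collect the parent operator name of every BroadcastExchange in the tree."""
    parents = []
    for child in plan.children:
        if child.name == "BroadcastExchange":
            parents.append(plan.name)
        parents.extend(broadcast_parents(child))
    return parents


def convertible(parent_name: str) -> bool:
    """Per this issue, only a broadcast under BroadcastHashJoin is handled today."""
    return parent_name == "BroadcastHashJoin"


# A join whose probe side is pruned using a broadcast subquery (simplified shape).
plan = PlanNode("BroadcastHashJoin", [
    PlanNode("Scan fact", [
        PlanNode("SubqueryBroadcastExec", [
            PlanNode("BroadcastExchange", [PlanNode("Scan dim")]),
        ]),
    ]),
    PlanNode("BroadcastExchange", [PlanNode("Scan dim")]),
])

for parent in broadcast_parents(plan):
    status = "supported" if convertible(parent) else "not supported yet"
    print(f"BroadcastExchange under {parent}: {status}")
```

The second BroadcastExchange (under the join) is the case Comet already handles; the first (under SubqueryBroadcastExec) is the one this issue is about.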

Describe the potential solution

No response

Additional context

No response

ganeshkumar269 commented 1 month ago

Hi @viirya, I would like to work on this. It seems like this might require some knowledge of the Spark codebase as well. Nevertheless, I will give it a shot; your guidance will be helpful 🙏🏾