NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

[AUDIT][SPARK-43061][CORE][SQL] Introduce PartitionEvaluator for SQL operator execution #8759

Open parthosa opened 1 year ago

parthosa commented 1 year ago

A new API, PartitionEvaluator, has been introduced in Apache Spark; it affects ProjectExec, FilterExec and WholeStageCodegenExec. We need to check whether similar changes are required in our plugin.

Apache Commit - apache/spark@cabba33cb2

revans2 commented 1 year ago

This is not a requirement for us, but it would be nice to do something like this for all of our code. The idea is that using an "Evaluator" instead of a lambda means that you need to explicitly call out what goes into the closure for that operation, which means we are much less likely to ship things we didn't expect (like the entire SQL plan).
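To make the closure point concrete, here is a minimal sketch in plain Scala. The two traits are simplified stand-ins modelled on Spark's PartitionEvaluator and PartitionEvaluatorFactory so the snippet runs without a Spark dependency; they are not the real Spark traits. ThresholdFilterEvaluatorFactory and EvaluatorSketch are hypothetical names used only for illustration. In Spark itself the factory would, to my understanding, be passed to something like RDD.mapPartitionsWithEvaluator rather than called directly.

```scala
// Simplified stand-ins modelled on Spark's evaluator API (assumption: shapes
// are approximate, and these are NOT the real org.apache.spark traits).
trait PartitionEvaluator[T, U] {
  def eval(partitionIndex: Int, inputs: Iterator[T]*): Iterator[U]
}

// The factory is the object that would be serialized and shipped to executors.
trait PartitionEvaluatorFactory[T, U] extends Serializable {
  def createEvaluator(): PartitionEvaluator[T, U]
}

// Hypothetical filter operator. Its constructor arguments are the complete
// list of state that travels with the task: here, a single Int.
class ThresholdFilterEvaluatorFactory(threshold: Int)
    extends PartitionEvaluatorFactory[Int, Int] {

  override def createEvaluator(): PartitionEvaluator[Int, Int] =
    new PartitionEvaluator[Int, Int] {
      override def eval(partitionIndex: Int, inputs: Iterator[Int]*): Iterator[Int] = {
        require(inputs.length == 1, "expected exactly one input iterator")
        // Only `threshold` is referenced; nothing from an enclosing plan node
        // can leak in, because there is no enclosing plan node in scope.
        inputs.head.filter(_ > threshold)
      }
    }
}

object EvaluatorSketch {
  def main(args: Array[String]): Unit = {
    val factory = new ThresholdFilterEvaluatorFactory(2)
    // Simulate evaluating partition 0 with a single input iterator.
    val out = factory.createEvaluator().eval(0, Iterator(1, 2, 3, 4)).toList
    println(out) // prints List(3, 4)
  }
}
```

By contrast, a lambda written inside an operator's execute method can silently capture `this` (and with it the whole plan) just by referencing a field. With the factory, anything shipped has to be passed through the constructor, so it shows up in code review.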

abellina commented 1 year ago

Here is another instance of this, where the API was used for columnar-to-row (C2R) and row-to-columnar (R2C) in Apache Spark.

https://github.com/apache/spark/commit/56b9f6cd46