NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

[FEA] Add testing for stage level scheduling #175

Open sameerz opened 4 years ago

sameerz commented 4 years ago

Is your feature request related to a problem? Please describe. Spark 3.1 includes stage level scheduling: https://issues.apache.org/jira/browse/SPARK-27495 . Add test cases that configure Spark to run with stage level scheduling enabled.

Describe the solution you'd like Run the spark-rapids plugin with stage level scheduling enabled for a subset of integration tests and validate the output.

Describe alternatives you've considered We could run all tests with stage level scheduling enabled, but we can start with a subset.

Additional context https://issues.apache.org/jira/browse/SPARK-27495

revans2 commented 4 years ago

I think part of this is going to be making our code work with stage level scheduling. It is simple to say whether the entire job needs GPUs or not, but on the GPU side we need to make sure the operators can tell Spark, for each stage where we translated something to the GPU, that a GPU is needed.

sameerz commented 4 years ago

After discussion, the right thing to do is to create an example that demonstrates running an ETL job with the plugin and then, in a subsequent stage, still running on the GPU but without the plugin. For example, an ETL job followed by an XGBoost job reading from Parquet files. Between stages, the configuration of the executors will change.

This can then be turned into a test case.
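
As a starting point for that example, here is a minimal sketch of how the two stages could ask Spark for different executor and task resources using the PySpark stage level scheduling API (`pyspark.resource`, Spark 3.1+). The `StageResources` class, the `ETL` and `TRAINING` values, the core and memory sizes, and `build_profile` are all hypothetical names and numbers chosen for illustration, not something the plugin provides. It also assumes the cluster manager supports stage level scheduling with dynamic allocation enabled, and a real cluster may need a GPU discovery script passed to `resource`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageResources:
    """Hypothetical description of what one stage asks Spark for."""
    name: str
    executor_cores: int
    executor_memory: str
    gpus_per_executor: int
    tasks_per_gpu: int  # how many concurrent tasks share one GPU

    @property
    def gpu_amount_per_task(self) -> float:
        # Spark expresses GPU sharing as a fractional per-task amount.
        return 1.0 / self.tasks_per_gpu


# Assumed values: the ETL stage lets several tasks share a GPU,
# while the training stage gives each task a whole GPU.
ETL = StageResources("etl", executor_cores=4, executor_memory="8g",
                     gpus_per_executor=1, tasks_per_gpu=4)
TRAINING = StageResources("training", executor_cores=1, executor_memory="16g",
                          gpus_per_executor=1, tasks_per_gpu=1)


def build_profile(stage: StageResources):
    """Translate a StageResources description into a Spark ResourceProfile."""
    # Imported lazily so the descriptions above can be inspected
    # on a machine that does not have PySpark installed.
    from pyspark.resource import (
        ExecutorResourceRequests,
        ResourceProfileBuilder,
        TaskResourceRequests,
    )

    executor_reqs = (
        ExecutorResourceRequests()
        .cores(stage.executor_cores)
        .memory(stage.executor_memory)
        .resource("gpu", stage.gpus_per_executor)
    )
    task_reqs = (
        TaskResourceRequests()
        .cpus(1)
        .resource("gpu", stage.gpu_amount_per_task)
    )
    # In PySpark, `build` is a property rather than a method.
    return ResourceProfileBuilder().require(executor_reqs).require(task_reqs).build
```

Stage level scheduling is only exposed on the RDD API, so the training stage would attach its profile with something like `etl_df.rdd.withResources(build_profile(TRAINING)).mapPartitions(train_fn)`, where `etl_df` is the output of the ETL job and `train_fn` is a placeholder for the XGBoost training function. A test case could then check `getResourceProfile()` on the resulting RDD and validate the output.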