NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0
822 stars 235 forks

[FEA] Implement Shim inclusion / activation after dist assembly is published #11745

Open gerashegalov opened 23 hours ago

gerashegalov commented 23 hours ago

Is your feature request related to a problem? Please describe.

Our dist assembly is complex and may be ready to be simplified, which will be the subject of another issue.

At this time, the set of shims participating in the assembly determines whether binary-dedupe.sh may move class files into a shared location or not.

Changing that set may change whether a class from sql-plugin-api passes the bitwise-identity test; if it does not, the build fails.
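To illustrate why the set of participating shims matters, here is a minimal Java sketch of a bitwise-identity check. It is not the logic of binary-dedupe.sh; the class `DedupeSketch`, the method `isBitwiseIdentical`, and the shim ids used as map keys are hypothetical placeholders chosen for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class DedupeSketch {
    /**
     * Returns true when every shim's copy of a class file is byte-for-byte
     * the same, i.e. the class could be moved into a shared location.
     * Keys are hypothetical shim ids, values are the class file bytes.
     */
    static boolean isBitwiseIdentical(Map<String, byte[]> bytesByShim) {
        byte[] first = null;
        for (byte[] bytes : bytesByShim.values()) {
            if (first == null) {
                first = bytes;
            } else if (!Arrays.equals(first, bytes)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, byte[]> copies = new LinkedHashMap<>();
        copies.put("shimA", new byte[] {1, 2, 3});
        copies.put("shimB", new byte[] {1, 2, 3});
        // With only these two shims in the assembly the copies match.
        System.out.println(isBitwiseIdentical(copies));

        // Adding a third shim with a different copy flips the result,
        // which is how the shim set can change the build outcome.
        copies.put("shimC", new byte[] {1, 2, 4});
        System.out.println(isBitwiseIdentical(copies));
    }
}
```

The point of the sketch is only that the answer depends on which shims are in the map, so a nightly and a release assembly with different shim sets can disagree.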

The current practice of configuring the dist differently for the nightly and the release CI pipelines results in this kind of issue. Using the same configuration for both would allow us to avoid last-minute pre-release issues caused by the release version not having the same layout as the one that has been tested for weeks with the nightly pipelines.

Describe the solution you'd like

This issue proposes to implement configs along the lines of

sparkXYZ.ShimServiceProvider.enabled

that are honored at run time.
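As a rough illustration of what honoring such a config at run time could look like, the following Java sketch filters the shims bundled in the dist by a per-shim flag. Everything here is an assumption made for the example: the class `ShimActivationSketch`, the methods `enabledKey` and `activeShims`, the shim ids, and in particular the choice that a shim is active unless explicitly disabled. The issue does not specify the final key name or the default.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ShimActivationSketch {
    /** Builds a key following the pattern sparkXYZ.ShimServiceProvider.enabled. */
    static String enabledKey(String shimId) {
        return shimId + ".ShimServiceProvider.enabled";
    }

    /**
     * Returns the bundled shims that are allowed to activate.
     * Assumption: a missing key means "enabled"; the real default
     * (for example for SNAPSHOT shims) is not decided in the issue.
     */
    static List<String> activeShims(List<String> bundled, Map<String, String> conf) {
        List<String> active = new ArrayList<>();
        for (String shimId : bundled) {
            String value = conf.getOrDefault(enabledKey(shimId), "true");
            if (Boolean.parseBoolean(value)) {
                active.add(shimId);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        // Hypothetical shim ids; the dist always contains all three.
        List<String> bundled = List.of("sparkA", "sparkB", "sparkC");

        Map<String, String> conf = new HashMap<>();
        conf.put(enabledKey("sparkC"), "false"); // e.g. a not-yet-released shim

        System.out.println(activeShims(bundled, conf));
    }
}
```

With this shape the assembly contents never change; only the run-time flag decides whether a bundled shim may be selected.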

This should ensure that the dist assembly stays constant throughout the entire lifecycle of a shim, including its introduction as a SNAPSHOT shim before the release.

Describe alternatives you've considered

Continue the old way.

Additional context

#11744