NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

Make delta-lake shim dependencies parametrizable [databricks] #11697

Closed gerashegalov closed 1 week ago

gerashegalov commented 2 weeks ago

Introduce properties in the parent pom that Spark shim profiles can override to specify the set of delta-lake shims for a particular Spark shim.

Add a single reusable array of delta-lake shim dependencies in the aggregator pom. Relies on Maven deduping dependencies.

Drop the verbose mirror of the Spark release profiles from the aggregator pom.

Fix ./build/make-scala-version-build-files.sh, which can currently fail silently without fully processing the poms.

Context: https://github.com/NVIDIA/spark-rapids/pull/11692#discussion_r1829488557
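To make the description above concrete, here is a minimal sketch of how overridable properties and a single dependency list could fit together. It is not taken from the PR diff. The property names `rapids.delta.artifactId1` and `rapids.delta.artifactId2`, the `release344` profile, the `com.nvidia` group and the `rapids-4-spark-delta-` prefix appear in the CI output in this thread; the number of slots, the placeholder default, the `24x` shim name, and the `scala.binary.version` and `spark.version.classifier` property names are assumptions made for illustration.

```xml
<!-- parent pom.xml (sketch) -->
<properties>
  <!-- Hypothetical placeholder default: every Spark shim profile is expected
       to override at least the first slot. -->
  <rapids.delta.artifactId1>DEFINE_FOR_EVERY_SPARK_SHIM</rapids.delta.artifactId1>
  <!-- Unused slots fall back to slot 1, so they resolve to the same artifact. -->
  <rapids.delta.artifactId2>${rapids.delta.artifactId1}</rapids.delta.artifactId2>
</properties>

<profiles>
  <profile>
    <id>release344</id>
    <properties>
      <!-- Illustrative value only; the real shim name may differ. -->
      <rapids.delta.artifactId1>rapids-4-spark-delta-24x</rapids.delta.artifactId1>
    </properties>
  </profile>
</profiles>

<!-- aggregator/pom.xml (sketch): one fixed list shared by all Spark shims -->
<dependencies>
  <dependency>
    <groupId>com.nvidia</groupId>
    <artifactId>${rapids.delta.artifactId1}_${scala.binary.version}</artifactId>
    <version>${project.version}</version>
    <classifier>${spark.version.classifier}</classifier>
  </dependency>
  <dependency>
    <groupId>com.nvidia</groupId>
    <artifactId>${rapids.delta.artifactId2}_${scala.binary.version}</artifactId>
    <version>${project.version}</version>
    <classifier>${spark.version.classifier}</classifier>
  </dependency>
</dependencies>
```

With this layout, a profile that needs only one delta-lake shim leaves the second slot at its fallback, so both entries point at the same artifact. That is where the reliance on Maven deduplicating dependencies comes in.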

gerashegalov commented 2 weeks ago

build

gerashegalov commented 2 weeks ago

build

gerashegalov commented 2 weeks ago

Added a check for a proper override
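The failure output quoted in this comment names the enforcer's `RequireProperty` rule and the `enforce-maven` execution. A check of this kind could be sketched as follows, assuming a placeholder default that a regex rejects. The sentinel value, the regex, and the use of `${buildver}` in the message are assumptions and not the PR's actual configuration.

```xml
<!-- parent pom.xml (sketch): fail fast when a shim profile omits the override -->
<properties>
  <!-- Hypothetical sentinel; a shim profile is expected to replace it. -->
  <rapids.delta.artifactId1>DEFINE_FOR_EVERY_SPARK_SHIM</rapids.delta.artifactId1>
</properties>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-enforcer-plugin</artifactId>
      <executions>
        <execution>
          <id>enforce-maven</id>
          <goals>
            <goal>enforce</goal>
          </goals>
          <configuration>
            <rules>
              <requireProperty>
                <property>rapids.delta.artifactId1</property>
                <!-- Assumed convention: delta-lake shim artifactIds share this
                     prefix, so the sentinel default does not match. -->
                <regex>rapids-4-spark-delta-.+</regex>
                <!-- Assumes profile ids follow the release<buildver> pattern. -->
                <regexMessage>At least one of rapids.delta.artifactId1, rapids.delta.artifactId2 ... is required in the POM profile "release${buildver}"</regexMessage>
              </requireProperty>
            </rules>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Because the sentinel is still a syntactically valid artifactId, the POM would load and the enforcer could report the missing override with a readable message.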

15:12:07,938 [ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.5.0:enforce (enforce-maven) on project rapids-4-spark-parent_2.13: 
15:12:07,938 [ERROR] Rule 1: org.apache.maven.enforcer.rules.property.RequireProperty failed with message:
15:12:07,938 [ERROR] At least one of rapids.delta.artifactId1, rapids.delta.artifactId2 ... is required in the POM profile "release344"

gerashegalov commented 2 weeks ago

build

gerashegalov commented 2 weeks ago

build

jlowe commented 2 weeks ago

build

gerashegalov commented 2 weeks ago

build

gerashegalov commented 2 weeks ago

build

pxLi commented 2 weeks ago

[2024-11-07T20:54:31.725Z] + echo 'Done with installation of Databricks dependencies, removing /tmp/install-databricks-deps-IdpEJ2-pom.xml'
[2024-11-07T20:54:31.725Z] Done with installation of Databricks dependencies, removing /tmp/install-databricks-deps-IdpEJ2-pom.xml
[2024-11-07T20:54:31.725Z] + rm /tmp/install-databricks-deps-IdpEJ2-pom.xml
[2024-11-07T20:54:31.725Z] + [[ '' == \1 ]]
[2024-11-07T20:54:31.725Z] + MVN_PHASES='clean package'
[2024-11-07T20:54:31.725Z] + mvn -Dmaven.wagon.http.retryHandler.count=3 -B -Ddatabricks -Dbuildver=332db clean package -DskipTests
[2024-11-07T20:54:32.656Z] [INFO] Scanning for projects...
[2024-11-07T20:54:32.656Z] [ERROR] [ERROR] Some problems were encountered while processing the POMs:
[2024-11-07T20:54:32.656Z] [ERROR] 'dependencies.dependency.artifactId' for com.nvidia:rapids-4-spark-delta-${spark.version.classifer}_2.12:jar:spark332db with value 'rapids-4-spark-delta-${spark.version.classifer}_2.12' does not match a valid id pattern. @ line 80, column 25
[2024-11-07T20:54:32.656Z] [ERROR] 'dependencies.dependency.artifactId' for com.nvidia:rapids-4-spark-delta-${spark.version.classifer}_2.12:jar:spark332db with value 'rapids-4-spark-delta-${spark.version.classifer}_2.12' does not match a valid id pattern. @ line 86, column 25
[2024-11-07T20:54:32.656Z] [ERROR] 'dependencies.dependency.artifactId' for com.nvidia:rapids-4-spark-delta-${spark.version.classifer}_2.12:jar:spark332db with value 'rapids-4-spark-delta-${spark.version.classifer}_2.12' does not match a valid id pattern. @ line 92, column 25
[2024-11-07T20:54:32.656Z] [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but found duplicate declaration of plugin org.apache.maven.plugins:maven-antrun-plugin @ line 521, column 21
[2024-11-07T20:54:32.656Z]  @ 
[2024-11-07T20:54:32.656Z] [ERROR] The build could not read 1 project -> [Help 1]
[2024-11-07T20:54:32.656Z] [ERROR]   
[2024-11-07T20:54:32.656Z] [ERROR]   The project com.nvidia:rapids-4-spark-aggregator_2.12:24.12.0-SNAPSHOT (/home/ubuntu/spark-rapids/aggregator/pom.xml) has 3 errors
[2024-11-07T20:54:32.656Z] [ERROR]     'dependencies.dependency.artifactId' for com.nvidia:rapids-4-spark-delta-${spark.version.classifer}_2.12:jar:spark332db with value 'rapids-4-spark-delta-${spark.version.classifer}_2.12' does not match a valid id pattern. @ line 80, column 25
[2024-11-07T20:54:32.656Z] [ERROR]     'dependencies.dependency.artifactId' for com.nvidia:rapids-4-spark-delta-${spark.version.classifer}_2.12:jar:spark332db with value 'rapids-4-spark-delta-${spark.version.classifer}_2.12' does not match a valid id pattern. @ line 86, column 25
[2024-11-07T20:54:32.656Z] [ERROR]     'dependencies.dependency.artifactId' for com.nvidia:rapids-4-spark-delta-${spark.version.classifer}_2.12:jar:spark332db with value 'rapids-4-spark-delta-${spark.version.classifer}_2.12' does not match a valid id pattern. @ line 92, column 25
[2024-11-07T20:54:32.656Z] [ERROR] 
[2024-11-07T20:54:32.656Z] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[2024-11-07T20:54:32.656Z] [ERROR] Re-run Maven using the -X switch to enable full debug logging.

gerashegalov commented 2 weeks ago

build