vertica / spark-connector

This component acts as a bridge between Spark and Vertica, allowing the user to either retrieve data from Vertica for processing in Spark, or store processed data from Spark into Vertica.
Apache License 2.0

Daily build fails on standalone cluster due to NoSuchMethodError #547

Closed jeremyprime closed 1 year ago

jeremyprime commented 1 year ago

Problem Description

Daily tests on a standalone cluster are failing (started May 29).

No code changes occurred recently, so the failures are likely due to an updated Docker image or third-party dependency. The most likely culprit is the Bitnami Spark image, which appears to have recently moved to Spark 3.4.0. Supporting Spark 3.4.0 may become a larger story (see #549), so we may want to pin the tests to Spark 3.3.x for now.


Spark Connector Logs

Complex Types Tests failure:

23/06/02 09:32:35 ERROR Main$: Uncaught exception from tests: 'java.lang.String org.apache.spark.sql.execution.datasources.PartitionedFile.filePath()'
java.lang.NoSuchMethodError: 'java.lang.String org.apache.spark.sql.execution.datasources.PartitionedFile.filePath()'
    at com.vertica.spark.datasource.wrappers.VerticaScanWrapper.$anonfun$planInputPartitions$1(VerticaScanWrapper.scala:40)
jeremyprime commented 1 year ago

See the method signature of PartitionedFile.filePath change from 3.3.2 to 3.4.0:

- https://javadoc.io/static/org.apache.spark/spark-sql_2.12/3.3.2/org/apache/spark/sql/execution/datasources/PartitionedFile.html
- https://javadoc.io/static/org.apache.spark/spark-sql_2.12/3.4.0/org/apache/spark/sql/execution/datasources/PartitionedFile.html
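The error above is the classic binary-incompatibility symptom: the connector was compiled against 3.3.x, where `filePath` returns a `String`, but at runtime 3.4.0 provides a `filePath` returning `org.apache.spark.paths.SparkPath`, so the JVM cannot find the expected method. One hedged sketch of a version-agnostic workaround is to resolve the accessor reflectively; `FilePathCompat` and `filePathOf` are illustrative names, not connector or Spark APIs:

```scala
// Sketch of a version-agnostic accessor for PartitionedFile.filePath.
// In Spark 3.3.x the method returns String; in 3.4.0 it returns SparkPath,
// so code compiled against one version throws NoSuchMethodError on the other.
// Resolving the method reflectively sidesteps the compile-time signature,
// and String.valueOf covers both return types (SparkPath is assumed to
// render the underlying path via toString).
object FilePathCompat {
  def filePathOf(partitionedFile: AnyRef): String = {
    val method = partitionedFile.getClass.getMethod("filePath")
    String.valueOf(method.invoke(partitionedFile))
  }
}
```

This trades a small reflection cost per partition for a single artifact that runs on both Spark lines; the alternative is separate builds per Spark version.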

jeremyprime commented 1 year ago

If pinning the version to 3.3.x, see the following files where we use latest (or reference that fact):

Updating docker-compose.yml and README.md is enough, as the nightly tests rely on the default version set in docker-compose.yml.
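A minimal sketch of the pin, assuming a typical docker-compose.yml layout (the service name and the exact 3.3.x tag below are illustrative, not taken from the repo):

```yaml
services:
  spark:
    # Pin to a Spark 3.3.x tag instead of `latest`, which now resolves to 3.4.0.
    # Replace 3.3.2 with the newest 3.3.x tag published for the Bitnami image.
    image: docker.io/bitnami/spark:3.3.2
```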

Find the latest Spark 3.3.x tag for the Bitnami Spark image here.