edward-capriolo-db opened 3 months ago
Many projects have run into similar issues from the Spark API change.
Thanks for the report!
Also, the dependencies bring in an old protobuf (2.5.0) that sets off OSS vulnerability scanning, alongside the newer one (3.19.6) that Spark pulls in through tink:

[INFO] | | +- org.apache.spark:spark-network-common_2.12:jar:3.5.1:compile
[INFO] | | | \- com.google.crypto.tink:tink:jar:1.9.0:compile
[INFO] | | |    +- com.google.code.gson:gson:jar:2.8.9:compile
[INFO] | | |    +- com.google.protobuf:protobuf-java:jar:3.19.6:compile
[INFO] | | |    \- joda-time:joda-time:jar:2.12.5:compile

[INFO] +- org.elasticsearch:elasticsearch-spark-30_2.12:jar:8.13.0:compile
[INFO] |  +- org.scala-lang:scala-reflect:jar:2.12.17:compile
[INFO] |  +- commons-logging:commons-logging:jar:1.1.1:compile
[INFO] |  +- javax.xml.bind:jaxb-api:jar:2.3.1:runtime
[INFO] |  \- com.google.protobuf:protobuf-java:jar:2.5.0:compile
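If protobuf-java 2.5.0 is not actually used at runtime, one possible consumer-side workaround is to exclude it and pin a patched version. This is only a sketch, assuming an sbt build (the coordinates are taken from the tree above); it is not an officially supported fix:

// build.sbt -- a sketch, not an official fix; verify at runtime that
// nothing on the classpath still needs protobuf-java 2.5.0.
libraryDependencies += ("org.elasticsearch" %% "elasticsearch-spark-30" % "8.13.0")
  .exclude("com.google.protobuf", "protobuf-java")

// Force a single, newer protobuf for whatever remains on the classpath.
dependencyOverrides += "com.google.protobuf" % "protobuf-java" % "3.19.6"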
Upgrading to Spark 3.5 is going to be tricky because of compiler errors like the following, caused by a breaking change in the Spark API:
[Error] /Users/kmassey/workspace/elasticsearch-hadoop/spark/core/src/main/scala/org/elasticsearch/spark/package.scala:34:42: Symbol 'type org.apache.spark.internal.Logging' is missing from the classpath.
This symbol is required by 'class org.apache.spark.SparkContext'.
Make sure that type Logging is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'SparkContext.class' was compiled against an incompatible version of org.apache.spark.internal.
[Error] /Users/kmassey/workspace/elasticsearch-hadoop/spark/core/src/main/scala/org/elasticsearch/spark/rdd/EsSpark.scala:25:8: Symbol 'type org.apache.spark.internal.Logging' is missing from the classpath.
This symbol is required by 'class org.apache.spark.rdd.RDD'.
Make sure that type Logging is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'RDD.class' was compiled against an incompatible version of org.apache.spark.internal.
[Error] /Users/kmassey/workspace/elasticsearch-hadoop/spark/core/src/main/scala/org/elasticsearch/spark/cfg/SparkSettingsManager.java:21:8: Symbol 'type org.apache.spark.internal.Logging' is missing from the classpath.
This symbol is required by 'class org.apache.spark.SparkConf'.
Make sure that type Logging is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'SparkConf.class' was compiled against an incompatible version of org.apache.spark.internal.
three errors found
I think we'll have to move several more classes from our spark core package down into the various spark-version-specific packages.
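As a rough sketch of what that layout could look like, here is one way to wire up version-specific source directories. It assumes an sbt-style build for brevity (elasticsearch-hadoop itself builds with Gradle), and the sparkBinaryVersion value and directory layout are hypothetical:

// build.sbt -- rough sketch only; the real project uses Gradle, and the
// version value and directory layout here are hypothetical.
val sparkBinaryVersion = "3.5"

Compile / unmanagedSourceDirectories +=
  baseDirectory.value / s"src/main/spark-$sparkBinaryVersion/scala"

Classes that touch APIs that changed between Spark versions would then live once per supported version, while everything else stays in the shared core package.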
These are unavoidable. Previously, in Hive, we made "shim layers" and used reflection to deal with breaking API changes. I will look into at least getting it working, and then we can see what the change set looks like.
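For illustration, a minimal sketch of what such a reflection-based shim could look like, assuming all we need is to detect where Spark's Logging type lives (the object name SparkLoggingShim and this exact approach are hypothetical, not code from elasticsearch-hadoop or Hive):

// A minimal reflection-based shim sketch. org.apache.spark.Logging was
// public in Spark 1.x and moved to org.apache.spark.internal.Logging in
// Spark 2.0, so we probe for whichever is on the classpath instead of
// linking against either type at compile time.
object SparkLoggingShim {
  private def tryLoad(name: String): Option[Class[_]] =
    try Some(Class.forName(name))
    catch { case _: ClassNotFoundException => None }

  // Resolved once; callers branch on this instead of referencing the type.
  lazy val loggingClass: Option[Class[_]] =
    tryLoad("org.apache.spark.internal.Logging")
      .orElse(tryLoad("org.apache.spark.Logging"))
}

The tradeoff is the usual one for shims: compile-time safety is given up in exchange for a single artifact that can run against multiple Spark versions.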
Are there any updates on this?
Issue description
Spark 3.5.1 has changed some UDF code in Catalyst, which breaks a number of applications built against older versions of Spark.
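For illustration only (this is not the reporter's code), the kind of code affected is an ordinary UDF pipeline: a job compiled against an older spark-sql/catalyst may fail at link time on 3.5.1, e.g. with a NoSuchMethodError, even though its source never touches Catalyst internals directly:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

// An ordinary UDF pipeline; nothing here references Catalyst directly,
// but the compiled bytecode still links against it through spark-sql.
object UdfExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("udf-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val upper = udf((s: String) => if (s == null) null else s.toUpperCase)
    spark.udf.register("upper_udf", upper)

    Seq("spark", "catalyst").toDF("v").selectExpr("upper_udf(v)").show()
    spark.stop()
  }
}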
Steps to reproduce
Code:
Stack trace:
Version Info
OS: Linux
JVM: JDK8/11
Hadoop/Spark:
ES-Hadoop:
ES: 7.X latest.