apache / amoro

Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
https://amoro.apache.org/
Apache License 2.0

[Bug]: "spark.driver.userClassPathFirst=true" will cause class conflicts. #2926

Closed: lintingbin closed this issue 2 weeks ago

lintingbin commented 3 weeks ago

What happened?

When using the Spark optimizer, Amoro sets spark.driver.userClassPathFirst=true by default, which causes Spark to fail to start. Removing this parameter allows it to start successfully. The specific error is shown in the relevant log output below.
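As a hedged sketch of the workaround described above, the same submit command from the log output below can be rerun with the --conf spark.driver.userClassPathFirst=true flag removed (all other settings, paths, and the Thrift address are copied from the reporter's environment; whether spark.executor.userClassPathFirst=true must also be dropped is not confirmed here):

export HADOOP_CONF_DIR=/opt/hadoop/config/
export SPARK_HOME=/opt/spark
export SPARK_CONF_DIR=/opt/hadoop/config/
export HADOOP_USER_NAME=hive
# Same arguments as the failing command, minus spark.driver.userClassPathFirst=true
/opt/spark/bin/spark-submit --master yarn --deploy-mode=cluster \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
  --conf spark.dynamicAllocation.maxExecutors=1 \
  --conf spark.shuffle.service.enabled=false \
  --conf spark.executor.userClassPathFirst=true \
  --proxy-user hive \
  --class org.apache.amoro.optimizer.spark.SparkOptimizer \
  /opt/spark/usrlib/optimizer-job.jar \
  -a thrift://127.0.0.1:1261 -p 1 -g hive-test-group -id 6bsl4lq5qk9nq95qo7pt4tlsuv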

Affects Versions

master

What table format are you seeing the problem on?

Iceberg

What engines are you seeing the problem on?

Optimizer

How to reproduce

No response

Relevant log output

root@amoro-5599c7d4c7-mq9hc:/usr/local/amoro# export HADOOP_CONF_DIR=/opt/hadoop/config/ && export SPARK_HOME=/opt/spark && export SPARK_CONF_DIR=/opt/hadoop/config/ && export HADOOP_USER_NAME=hive && /opt/spark/bin/spark-submit --master yarn --deploy-mode=cluster --conf spark.dynamicAllocation.shuffleTracking.enabled=true --conf spark.executor.userClassPathFirst=true --conf spark.shuffle.service.enabled=false --conf spark.dynamicAllocation.maxExecutors=1 --conf spark.driver.userClassPathFirst=true --conf spark.dynamicAllocation.enabled=true --proxy-user hive --class org.apache.amoro.optimizer.spark.SparkOptimizer /opt/spark/usrlib/optimizer-job.jar  -a thrift://127.0.0.1:1261 -p 1 -g hive-test-group -id 6bsl4lq5qk9nq95qo7pt4tlsuv
Exception in thread "main" java.lang.LinkageError: loader constraint violation: when resolving method "org.slf4j.impl.StaticLoggerBinder.getLoggerFactory()Lorg/slf4j/ILoggerFactory;" the class loader (instance of org/apache/spark/util/ChildFirstURLClassLoader) of the current class, org/slf4j/LoggerFactory, and the class loader (instance of sun/misc/Launcher$AppClassLoader) for the method's defining class, org/slf4j/impl/StaticLoggerBinder, have different Class objects for the type org/slf4j/ILoggerFactory used in the signature
    at org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:423)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:362)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:388)
    at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:228)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2625)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:98)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:81)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:139)
    at org.apache.hadoop.yarn.client.RMProxy.createNonHaRMFailoverProxyProvider(RMProxy.java:169)
    at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:132)
    at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:103)
    at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:73)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:242)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:192)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1327)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1764)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:984)
    at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:175)
    at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:173)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:173)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:214)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1072)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1081)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
24/06/14 08:48:02 INFO ShutdownHookManager: Shutdown hook called
24/06/14 08:48:02 INFO ShutdownHookManager: Deleting directory /tmp/spark-a06eb0af-184c-456b-b63f-c4cabada0819
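One reading of the LinkageError above: with child-first loading enabled, org/slf4j/LoggerFactory (and the org/slf4j/ILoggerFactory it returns) are resolved by ChildFirstURLClassLoader from the user class path, while org/slf4j/impl/StaticLoggerBinder is defined by the application class loader from the Spark/Hadoop class path, so the two loaders end up with different Class objects for ILoggerFactory and the JVM rejects the call. A quick check of whether the optimizer fat jar bundles its own SLF4J classes (jar path taken from the submit command above):

# List SLF4J classes shipped inside the optimizer fat jar, if any
jar tf /opt/spark/usrlib/optimizer-job.jar | grep 'org/slf4j'

If SLF4J shows up both in the fat jar and on the Spark/Hadoop class path, the usual ways to avoid the conflict are to exclude or relocate SLF4J in the fat jar, or to drop the userClassPathFirst flags as noted above.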

Anything else

No response

Are you willing to submit a PR?

Code of Conduct