apache / amoro

Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
https://amoro.apache.org/
Apache License 2.0

[Bug]: "spark.executor.userClassPathFirst=true" will cause a class not found error. #2927

lintingbin closed 1 week ago

lintingbin commented 3 weeks ago

What happened?

When using the Spark optimizer, Amoro sets spark.executor.userClassPathFirst=true by default. This causes a java.lang.NoClassDefFoundError on the executors when Spark runs optimization tasks (see the log output below).
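
A possible workaround, not confirmed anywhere in this thread, is to override the property to false when the optimizer job is submitted. The sketch below only illustrates how a default can be merged with a user override and rendered as spark-submit `--conf key=value` flags. `DEFAULT_SPARK_CONF` and `build_conf_args` are hypothetical names and not Amoro APIs; only the property name and the `--conf` syntax come from Spark itself. Whether Amoro honors such an override is an assumption.

```python
# Hypothetical helper for illustration only; none of these names exist in Amoro.
DEFAULT_SPARK_CONF = {
    # The default described in this report.
    "spark.executor.userClassPathFirst": "true",
}


def build_conf_args(overrides=None):
    """Merge user overrides onto the defaults and render spark-submit --conf flags."""
    conf = dict(DEFAULT_SPARK_CONF)
    conf.update(overrides or {})  # user-supplied values win over defaults
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Render the flags with the property switched back to false.
    print(" ".join(build_conf_args({"spark.executor.userClassPathFirst": "false"})))
```

Letting the override win over the default keeps the change local to the job submission, so the executors would fall back to Spark's normal parent-first class loading.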

Affects Versions

master

What table format are you seeing the problem on?

Iceberg

What engines are you seeing the problem on?

Optimizer

How to reproduce

No response

Relevant log output

Job aborted due to stage failure: Task 0 in stage 29.0 failed 4 times, most recent failure: Lost task 0.3 in stage 29.0 (TID 119) (task-6-5.c-10c76e23f1418d1d.cn-shanghai.emr.aliyuncs.com executor 1): java.lang.NoClassDefFoundError: Could not initialize class org.apache.amoro.shade.thrift.org.apache.thrift.transport.TIOStreamTransport
    at org.apache.amoro.api.OptimizingTask.readObject(OptimizingTask.java:485)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1184)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2321)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2212)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1668)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2118)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1656)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2430)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2354)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2212)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1668)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2430)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:632)
    at org.apache.spark.rdd.ParallelCollectionPartition.$anonfun$readObject$1(ParallelCollectionRDD.scala:73)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1470)
    at org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:69)
    at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1184)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2321)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2212)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1668)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2430)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2354)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2212)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1668)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:502)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:460)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:87)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:129)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:507)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)

Driver stacktrace:

Anything else

No response

zhoujinsong commented 1 week ago

Closed by #2950.