AbsaOSS / ABRiS

Avro SerDe for Apache Spark structured APIs.
Apache License 2.0

Error: ClassNotFoundException: za.co.absa.abris.avro.read.confluent.SchemaManager #315

Closed Rap70r closed 1 year ago

Rap70r commented 1 year ago

Hello,

I'm getting the error below when running a spark-submit job:

ClassNotFoundException: za.co.absa.abris.avro.read.confluent.SchemaManager

Spark: 3.2.1

    spark-submit \
      --repositories http://packages.confluent.io/maven/ \
      --jars s3://bucket/jars/abris.jar,s3://bucket/jars/hudi-spark-bundle.jar \
      --packages org.apache.spark:spark-avro_2.12:3.2.1,org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1,io.confluent:kafka-schema-registry-client:5.3.4,io.confluent:kafka-avro-serializer:5.3.4,org.apache.kafka:kafka_2.12:3.3.1 \
      --class some_class \
      s3://bucket/jars/my_app.jar

I'm using the latest version of the library from Maven. Any idea how to solve this issue?

Thank you

cerveada commented 1 year ago

The SchemaManager class should be available inside the Abris jar, but it seems that in your case s3://bucket/jars/abris.jar doesn't contain it. I don't know why that would happen.

Can you manually open the jar and check the class is there?

Rap70r commented 1 year ago

Hi @cerveada,

I extracted the jar and I can see the class in the above path. Could the error be related to something else? Could it be because the library versions are not aligned?

Thank you

cerveada commented 1 year ago

Can you provide the whole exception with a stack trace? Or maybe the whole log of that run? You can upload files here, and it may help to find the issue.

Rap70r commented 1 year ago

Hi @cerveada,

Unfortunately, there is not much info in the stack trace:

    INFO Client: 
     client token: N/A
     diagnostics: User class threw exception: java.lang.NoClassDefFoundError: za/co/absa/abris/avro/read/confluent/SchemaManager$
    at scala.collection.immutable.List.foreach(List.scala:431)
    at scala.Function0.apply$mcV$sp(Function0.scala:39)
    at scala.Function0.apply$mcV$sp$(Function0.scala:39)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
    at scala.App.$anonfun$main$1$adapted(App.scala:80)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at scala.App.main(App.scala:80)
    at scala.App.main$(App.scala:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:740)
    Caused by: java.lang.ClassNotFoundException: za.co.absa.abris.avro.read.confluent.SchemaManager$
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 23 more

cerveada commented 1 year ago

I opened the jar from maven https://mvnrepository.com/artifact/za.co.absa/abris_2.12/6.3.0

    za/co/absa/abris/avro/read/confluent % ls
    ConfluentConstants$.class
    ConfluentConstants.class
    SchemaManager.class
    SchemaManagerFactory$$anonfun$$nestedInanonfun$getOrCreateRegistryClient$1$1.class
    SchemaManagerFactory$.class
    SchemaManagerFactory.class

There is no SchemaManager$ class. Only SchemaManager. The $ is important here.

In Scala, a class and an object can share the same name. This is not possible in Java, so when the code is compiled, the object's class file gets a $ appended to its name.
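To make the distinction easy to check, here is a minimal shell sketch that reads a jar listing on standard input and reports whether the class file, the object file, or both are present. The function name `scala_kinds` is made up for this illustration; it only assumes a listing with one entry path per line, such as the one printed by the JDK's `jar tf`.

```shell
# Hypothetical helper (not part of ABRiS or the JDK): reads jar entry paths
# on stdin, one per line, and reports which compiled forms of NAME exist.
# NAME is the entry path inside the jar without the ".class" suffix.
scala_kinds() {
    name=$1
    listing=$(cat)          # capture stdin so it can be searched twice
    has_class=no
    has_object=no
    # -x matches the whole line and -F matches literally,
    # so "$" and "." are not treated as regex characters
    if printf '%s\n' "$listing" | grep -q -x -F -- "${name}.class"; then
        has_class=yes
    fi
    # A Scala object compiles to a class file with "$" appended to its name
    if printf '%s\n' "$listing" | grep -q -x -F -- "${name}\$.class"; then
        has_object=yes
    fi
    echo "class=${has_class} object=${has_object}"
}
```

Against a local copy of the jar it could be used like this:

    jar tf abris.jar | scala_kinds za/co/absa/abris/avro/read/confluent/SchemaManager

For a jar whose contents match the listing above, this should report class=yes object=no.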

In Abris, the SchemaManager object was last present in version 3.2 (https://github.com/AbsaOSS/ABRiS/blob/branch-3.2/src/main/scala/za/co/absa/abris/avro/read/confluent/SchemaManager.scala). It was removed in newer versions.

So it looks like your code is using the SchemaManager object somewhere and that is causing the issue.
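One way to track down such a reference is to search the compiled application for the object's internal name, since class files store the names of the classes they reference as plain strings. The following is only a rough sketch, assuming `grep` and `unzip` are available; the helper name `mentions_symbol` is invented here.

```shell
# Hypothetical helper: succeeds (exit status 0) if the bytes on stdin
# contain NEEDLE as a literal string.
mentions_symbol() {
    # -a: treat binary input as text, -F: literal match, -q: no output
    grep -a -q -F -- "$1"
}
```

With a local copy of the application jar (my_app.jar in the command above), it could be used like this:

    unzip -p my_app.jar '*.class' | mentions_symbol 'za/co/absa/abris/avro/read/confluent/SchemaManager$' && echo "object referenced"

Note that this is a plain substring search, so it would also match inner class names that start with SchemaManager$.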

Rap70r commented 1 year ago

Hi @cerveada,

Oh wow, I thought I saw it when I extracted the jar. Hmm... My code does indeed use that class. Not sure why the class was removed. However, when I check the path under the master branch I do see it:

https://github.com/AbsaOSS/ABRiS/blob/master/src/main/scala/za/co/absa/abris/avro/read/confluent/SchemaManager.scala

Isn't that the same one? Or does it get excluded from the library?

Thank you

cerveada commented 1 year ago

As I was trying to explain in the previous message, there are two classes:

SchemaManager.class for the SchemaManager Scala class

SchemaManager$.class for the SchemaManager Scala object

The object was removed in Abris 4.

> when I check the path under master branch I do see it: https://github.com/AbsaOSS/ABRiS/blob/master/src/main/scala/za/co/absa/abris/avro/read/confluent/SchemaManager.scala

That is the Scala class, but you can see that there is no Scala object defined in that file.

This may explain better what object I am talking about: https://docs.scala-lang.org/overviews/scala-book/companion-objects.html