dfdx / Spark.jl

Julia binding for Apache Spark

InitError: JavaCall.JavaCallError("Class Not Found org/apache/log4j/Level") #118

Open ai-ml-with-kapil opened 1 year ago

ai-ml-with-kapil commented 1 year ago

I am facing this issue:

julia> if Sys.isunix()
           ENV["JULIA_COPY_STACKS"] = 1
       end
1

julia> using JavaCall

julia> using Spark
ERROR: InitError: JavaCall.JavaCallError("Class Not Found org/apache/log4j/Level")
Stacktrace:
  [1] _metaclass(class::Symbol)
    @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:383
  [2] metaclass(class::Symbol)
    @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:389
  [3] jfield(typ::Type{JavaObject{Symbol("org.apache.log4j.Level")}}, field::String, fieldType::Type)
    @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:263
  [4] set_log_level(log_level::String)
    @ Spark ~/.julia/packages/Spark/89BUd/src/init.jl:8
  [5] init(; log_level::String)
    @ Spark ~/.julia/packages/Spark/89BUd/src/init.jl:60
  [6] init
    @ ~/.julia/packages/Spark/89BUd/src/init.jl:16 [inlined]
  [7] init()
    @ Spark ~/.julia/packages/Spark/89BUd/src/core.jl:30
  [8] _include_from_serialized(pkg::Base.PkgId, path::String, depmods::Vector{Any})
    @ Base ./loading.jl:831
  [9] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt64)
    @ Base ./loading.jl:1039
 [10] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1315
 [11] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
 [12] macro expansion
    @ ./loading.jl:1180 [inlined]
 [13] macro expansion
    @ ./lock.jl:223 [inlined]
 [14] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
during initialization of module Spark

Can someone please help me?

djliden commented 1 year ago

I also encountered this. MacOS 13.5. I tried it with Julia versions 1.8 and 1.9.

dfdx commented 1 year ago

Could you please re-build Spark.jl and post the log here?

] build Spark

djliden commented 1 year ago

Thanks for looking! Here is the log.

build.log

dfdx commented 1 year ago

I don't see anything strange in your log and can't reproduce it, so let's start with a couple of tests.

Check if JavaCall works fine:

# Julia REPL
using JavaCall

JHashMap = @jimport java.util.HashMap
jmap = JHashMap(())
listmethods(jmap, "put")
jcall(jmap, "put", JObject, (JObject, JObject), "foo", "text value")

Check the versions and env vars:

# bash
java -version
mvn -version
echo $SPARK_HOME

Check that Spark.jl generated all the needed artifacts:

# bash
# you may need to install tree utility or anything to print the contents of the directory
tree ~/.julia/packages/Spark

# alternatively
ls ~/.julia/packages/Spark/*/jvm/sparkjl/target

djliden commented 1 year ago

Looks like the problem is with JavaCall. I'll see what I can do to track down the issue from that direction.

julia> # Julia REPL
       using JavaCall

julia> JHashMap = @jimport java.util.HashMap
JavaObject{Symbol("java.util.HashMap")}

julia> jmap = JHashMap(())
ERROR: JavaCall.JavaCallError("JVM not initialised. Please run init()")
Stacktrace:
 [1] assertloaded
   @ ~/.julia/packages/JavaCall/MlduK/src/jvm.jl:241 [inlined]
 [2] jnew(::Symbol, ::Tuple{})
   @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:211
 [3] JavaObject{Symbol("java.util.HashMap")}(::Tuple{})
   @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:103
 [4] top-level scope
   @ REPL[3]:1

julia> listmethods(jmap, "put")
ERROR: UndefVarError: `jmap` not defined
Stacktrace:
 [1] top-level scope
   @ REPL[4]:1

julia> JavaCall.init()

julia> JHashMap = @jimport java.util.HashMap
JavaObject{Symbol("java.util.HashMap")}

julia> jmap = JHashMap(())
JavaObject{Symbol("java.util.HashMap")}(JavaCall.JavaLocalRef(Ptr{Nothing} @0x000000012312adc0))

julia> listmethods(jmap, "put")
1-element Vector{JMethod}:
Exception in thread "main" Error showing value of type Vector{JMethod}:
ERROR: JavaCall.JavaCallError("Java Exception thrown, but no details could be retrieved from the JVM")
Stacktrace:
  [1] geterror(allow::Bool)
    @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:411
  [2] geterror
    @ ~/.julia/packages/JavaCall/MlduK/src/core.jl:403 [inlined]
  [3] _jcall(::JMethod, ::Ptr{Nothing}, ::Ptr{Nothing}, ::Type, ::Tuple{})
    @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:373
  [4] jcall(::JMethod, ::String, ::Type, ::Tuple{})
    @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/core.jl:245
  [5] getname
    @ ~/.julia/packages/JavaCall/MlduK/src/reflect.jl:62 [inlined]
  [6] show(io::IOContext{IOBuffer}, method::JMethod)
    @ JavaCall ~/.julia/packages/JavaCall/MlduK/src/reflect.jl:170
  [7] sprint(f::Function, args::JMethod; context::IOContext{Base.TTY}, sizehint::Int64)
    @ Base ./strings/io.jl:112
  [8] sprint
    @ ./strings/io.jl:107 [inlined]
  [9] alignment_from_show
    @ ./show.jl:2817 [inlined]
 [10] alignment(io::IOContext{Base.TTY}, x::JMethod)
    @ Base ./show.jl:2836
 [11] alignment(io::IOContext{Base.TTY}, X::AbstractVecOrMat, rows::Vector{Int64}, cols::Vector{Int64}, cols_if_complete::Int64, cols_otherwise::Int64, sep::Int64, ncols::Int64)
    @ Base ./arrayshow.jl:69
 [12] _print_matrix(io::IOContext{Base.TTY}, X::AbstractVecOrMat, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64, rowsA::UnitRange{Int64}, colsA::UnitRange{Int64})
    @ Base ./arrayshow.jl:207
 [13] print_matrix(io::IOContext{Base.TTY}, X::Vector{JMethod}, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64)
    @ Base ./arrayshow.jl:171
 [14] print_matrix
    @ ./arrayshow.jl:171 [inlined]
 [15] print_array
    @ ./arrayshow.jl:358 [inlined]
 [16] show(io::IOContext{Base.TTY}, #unused#::MIME{Symbol("text/plain")}, X::Vector{JMethod})
    @ Base ./arrayshow.jl:399
 [17] (::REPL.var"#55#56"{REPL.REPLDisplay{REPL.LineEditREPL}, MIME{Symbol("text/plain")}, Base.RefValue{Any}})(io::Any)
    @ REPL /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/REPL.jl:276
 [18] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/REPL.jl:557
 [19] display(d::REPL.REPLDisplay, mime::MIME{Symbol("text/plain")}, x::Any)
    @ REPL /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/REPL.jl:262
 [20] display
    @ /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/REPL.jl:281 [inlined]
 [21] display(x::Any)
    @ Base.Multimedia ./multimedia.jl:340
 [22] print_response(errio::IO, response::Any, show_value::Bool, have_color::Bool, specialdisplay::Union{Nothing, AbstractDisplay})
    @ REPL /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/REPL.jl:0
 [23] (::REPL.var"#57#58"{REPL.LineEditREPL, Pair{Any, Bool}, Bool, Bool})(io::Any)
    @ REPL /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/REPL.jl:287
 [24] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/REPL.jl:557
 [25] print_response(repl::REPL.AbstractREPL, response::Any, show_value::Bool, have_color::Bool)
    @ REPL /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/REPL.jl:285
 [26] (::REPL.var"#do_respond#80"{Bool, Bool, REPL.var"#93#103"{REPL.LineEditREPL, REPL.REPLHistoryProvider}, REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::Any, ok::Bool)
    @ REPL /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/REPL.jl:899
 [27] #invokelatest#2
    @ ./essentials.jl:816 [inlined]
 [28] invokelatest
    @ ./essentials.jl:813 [inlined]
 [29] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/LineEdit.jl:2647
 [30] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL /Applications/Julia-1.9.app/Contents/Resources/julia/share/julia/stdlib/v1.9/REPL/src/REPL.jl:1300
 [31] (::REPL.var"#62#68"{REPL.LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ./task.jl:514

Versions:

shell> java -version
java version "11.0.19" 2023-04-18 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.19+9-LTS-224)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.19+9-LTS-224, mixed mode)

shell> mvn -version
Apache Maven 3.9.4 (dfbb324ad4a7c8fb0bf182e6d91b0ae20e3d2dd9)
Maven home: /opt/homebrew/Cellar/maven/3.9.4/libexec
Java version: 11.0.19, vendor: Oracle Corporation, runtime: /Library/Java/JavaVirtualMachines/jdk-11.jdk/Contents/Home
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "13.5", arch: "aarch64", family: "mac"

djliden commented 1 year ago

I have a bit more testing to do but at least part of the issue appears to relate back to this warning in the JavaCall docs:

Setting JULIA_COPY_STACKS=yes in startup.jl will not work. It must be set before Julia starts. On *nix based systems, this can be done from the shell by using $ JULIA_COPY_STACKS=yes julia from a shell.

https://juliainterop.github.io/JavaCall.jl/
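As a minimal sketch of what the quoted warning implies, the variable can be set in the shell before Julia is launched. The per-command form is the one quoted from the JavaCall docs; the `export` form and the sanity check are illustrative additions and assume a POSIX-style shell.

```shell
# Option 1: set the variable for a single command only (form quoted from the JavaCall docs):
#   JULIA_COPY_STACKS=yes julia

# Option 2 (assumption: POSIX-style shell): export it so every process
# started from this shell, including julia, inherits it.
export JULIA_COPY_STACKS=yes

# Sanity check: a child process should report the value rather than "unset".
sh -c 'echo "JULIA_COPY_STACKS=${JULIA_COPY_STACKS:-unset}"'
```

Setting the value inside an already running Julia session, or in startup.jl, is too late according to the warning above, which is why the check is done from the shell.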

Now I'm hitting:

julia> spark = SparkSession.builder.appName("Main").master("local").getOrCreate()

[45697] signal (11.2): Segmentation fault: 11
in expression starting at REPL[3]:1
unknown function (ip: 0x28002f3ec)
Allocations: 4487659 (Pool: 4482903; Big: 4756); GC: 6
[1]    45697 segmentation fault  JULIA_COPY_STACKS=yes julia

This seems unrelated to the original issue. I still have some things to figure out, but perhaps the original issue can be deemed resolved by setting JULIA_COPY_STACKS=yes before Julia starts, rather than in startup.jl or in the running session.

dfdx commented 1 year ago

Regarding segmentation fault, can you try a different combination of Julia and JDK? IIRC, we had a similar issue with Julia 1.2-1.5, and also a few issues with OpenJDK. Currently, Julia 1.9.2 and OpenJDK 11.0.19 work for me on Ubuntu 20.04, but I don't have MacOS to test this setup there.

djliden commented 1 year ago

I'll dig into that further a little later this week; thank you for the pointers.

djliden commented 1 year ago

Apologies for the slow response. I believe everything is working for me now. I'm not sure which exact combination of fixes got everything working; I hope to start over later and come up with a reproducible guide. But for reference, the major changes were:

At this point I have:

$ java -version              
openjdk version "11.0.20" 2023-07-18
OpenJDK Runtime Environment Homebrew (build 11.0.20+0)
OpenJDK 64-Bit Server VM Homebrew (build 11.0.20+0, mixed mode)
$ mvn -version
Apache Maven 3.9.4 (dfbb324ad4a7c8fb0bf182e6d91b0ae20e3d2dd9)
Maven home: /opt/homebrew/Cellar/maven/3.9.4/libexec
Java version: 11.0.20, vendor: Homebrew, runtime: /opt/homebrew/Cellar/openjdk@11/11.0.20/libexec/openjdk.jdk/Contents/Home
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "13.5", arch: "aarch64", family: "mac"

Then I was able to do

$ JULIA_COPY_STACKS=yes julia

julia> using Spark
julia> spark = SparkSession.builder.appName("Main").master("local").getOrCreate()
julia> df = spark.createDataFrame([["Alice", 19], ["Bob", 23]], "name string, age long")
+-----+---+
| name|age|
+-----+---+
|Alice| 19|
|  Bob| 23|
+-----+---+

pawankukreja01 commented 1 year ago

The error message suggests that the JavaCall package is unable to find the org/apache/log4j/Level class. This error is usually caused by a missing dependency or a version mismatch between the packages.

Here are some steps you can try to resolve this issue:

Check if you have installed the required dependencies for the JavaCall and Spark packages. You can do this by running the following command in your Julia REPL:

] status JavaCall Spark

This will show you the list of installed packages and their versions. Make sure that all the required dependencies are installed and up-to-date.

Try updating the JavaCall package to the latest version by running the following command in your Julia REPL:

] up JavaCall

If updating the package doesn’t work, try uninstalling and reinstalling the JavaCall package by running the following commands in your Julia REPL:

] rm JavaCall
] add JavaCall