JuliaML / LIBSVM.jl

LIBSVM bindings for Julia

crash with Julia 1.8.4 under Windows 11 #95

Closed mlesnoff closed 1 year ago

mlesnoff commented 1 year ago

I am using LIBSVM.jl v0.8.0, under Windows 11.

It works fine with Julia 1.8.3:

julia> versioninfo()
Julia Version 1.8.3
Commit 0434deb161 (2022-11-14 20:14 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 16 × Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 8 on 16 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 8

But when I use Julia 1.8.4:

julia> versioninfo()
Julia Version 1.8.4
Commit 00177ebc4f (2022-12-23 21:32 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 16 × Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 8 on 16 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 8

running any function kills my Julia session, for instance after doing:

using LIBSVM
(X, y) = (randn(100,4), randn(100))

and then the command below:

svmtrain(X', y)

kills the process and closes Julia immediately.

Has anybody observed the same problem, or does anyone know what is happening?

barucden commented 1 year ago

Do you see any error?

mlesnoff commented 1 year ago

No. When I use Julia directly (i.e., without an IDE), no error is printed; the Julia session just closes and disappears. When I use VS Code, the terminal closes (without printing an error), and a VS Code alert says: "The terminal process terminated with exit code: 3221226356".

I observe the same problem with the package XGBoost.jl. Both packages are wrappers for external libraries; maybe this is related.
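
Since the crash takes the whole session down, one way to observe it without losing the REPL is to run the failing call in a child Julia process and read its exit code. The following is only a minimal sketch: the helper name `child_exit_code` is hypothetical, and it assumes LIBSVM is installed in whatever environment the child process loads by default.

```julia
# Hypothetical helper: run a snippet of Julia code in a separate process
# and return that process's exit code instead of letting a crash kill
# the current session.
function child_exit_code(code::AbstractString)
    # Base.julia_cmd() reproduces the command used to start this Julia.
    cmd = `$(Base.julia_cmd()) --startup-file=no -e $code`
    # ignorestatus prevents `run` from throwing on a non-zero exit code.
    proc = run(ignorestatus(cmd))
    return proc.exitcode
end

# Usage with the reproducer from this report (assumes LIBSVM is installed
# in the environment the child process picks up):
# code = "using LIBSVM; X, y = randn(100, 4), randn(100); svmtrain(X', y)"
# println(child_exit_code(code))
```

A non-zero value returned here can then be attached to the report even when nothing is printed to the terminal.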

barucden commented 1 year ago

Okay. The exit code should mean heap corruption and since I have no idea how to debug that... can you try running the following code in a fresh Julia session? It is basically the body of svmtrain. Maybe we will be able to figure out which line causes the crash.

using LIBSVM
(X, y) = (randn(100,4), randn(100))
X = X'
LIBSVM.set_num_threads(1)
degree = 3
_svmtype = 0
_kernel = Int32(Kernel.RadialBasis)
LIBSVM.check_train_input(X, y, Kernel.RadialBasis)
idx, reverse_labels, weights, weight_labels = LIBSVM.indices_and_weights(y, X, nothing)
param = LIBSVM.SVMParameter(
        _svmtype, _kernel, Int32(3), Float64(1.0 / size(X, 1)),
        0.0, 200.0, 0.001, 1.0, Int32(length(weights)),
        pointer(weight_labels), pointer(weights), 0.5, 0.1, Int32(true),
        Int32(false))
ninstances = size(X, 2)
nodes, nodeptrs =  LIBSVM.instances2nodes(X)
problem = LIBSVM.SVMProblem(Int32(ninstances), pointer(idx), pointer(nodeptrs))
LIBSVM.libsvm_set_verbose(true)
@GC.preserve nodes begin
    LIBSVM.libsvm_check_parameter(problem, param)
    ptr_model = LIBSVM.libsvm_train(problem, param)
end
svm = LIBSVM.SVM(unsafe_load(ptr_model), y, X, nothing, reverse_labels, SVC, Kernel.RadialBasis)
LIBSVM.libsvm_free_model(ptr_model)
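
For reference, the exit code reported earlier in the thread (3221226356) can be decoded by viewing it as an unsigned 32-bit value in hexadecimal. A small sketch; the constant name follows the Windows NTSTATUS naming and is defined locally here, not taken from any Julia package:

```julia
# Exit code shown by VS Code when the terminal process died.
exit_code = 3221226356

# Windows NTSTATUS codes are 32-bit values, usually written in hex.
ntstatus = UInt32(exit_code)
hex = "0x" * string(ntstatus, base = 16, pad = 8)
println(hex)  # prints 0xc0000374

# NTSTATUS value for heap corruption (local constant for illustration).
const STATUS_HEAP_CORRUPTION = 0xc0000374
is_heap_corruption = ntstatus == STATUS_HEAP_CORRUPTION
println(is_heap_corruption)  # prints true
```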

mlesnoff commented 1 year ago

Thanks, I did run it:

julia> using LIBSVM

julia> (X, y) = (randn(100,4), randn(100))
([1.1571690985215213 -1.4655808933016536 -0.5977168643349797 0.23670932033058822; 1.045503287624356 1.522046538174527 0.4086712588952419 -0.9324406722062056; … ; 0.015344693271484194 -1.2274585735423154 -1.0148089998062437 0.1819779353324507; 1.164254578419492 -0.4848194088689624 0.8331051401309986 -1.1937463750802504], [-0.27092475097381574, -2.7869592919398616, 0.39722507000723495, -0.6144566882437535, -0.508227597172905, -0.04517546023972688, 0.3874286075603758, 0.4228250301983162, -0.06756199848523661, 0.8778665954489996  …  1.008174872315394, -0.3929995471881523, 0.7032768374006367, 0.38025931107838473, -1.7270008907829422, 0.22935366733471255, 0.17882862687997875, 0.44289178845636096, -1.3425301258045377, 0.4706486988581435])

julia> X = X'
4×100 adjoint(::Matrix{Float64}) with eltype Float64:
  1.15717    1.0455    -1.70691   -0.938373  -1.29473   …   1.65808    0.0553481   1.56421    0.0153447   1.16425
 -1.46558    1.52205    0.189132   1.12947    2.11767      -2.15229    0.746509    1.48085   -1.22746    -0.484819
 -0.597717   0.408671   1.82956    0.707923   1.52135       2.06081   -0.726431    0.278311  -1.01481     0.833105
  0.236709  -0.932441   0.569482  -0.361493   0.128428     -0.243661   0.88279    -0.243693   0.181978   -1.19375

julia> LIBSVM.set_num_threads(1)

julia> degree = 3
3

julia> _svmtype = 0
0

julia> _kernel = Int32(Kernel.RadialBasis)
2

julia> LIBSVM.check_train_input(X, y, Kernel.RadialBasis)

julia> idx, reverse_labels, weights, weight_labels = LIBSVM.indices_and_weights(y, X, nothing)
([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0  …  91.0, 92.0, 93.0, 94.0, 95.0, 96.0, 97.0, 98.0, 99.0, 100.0], [-0.27092475097381574, -2.7869592919398616, 0.39722507000723495, -0.6144566882437535, -0.508227597172905, -0.04517546023972688, 0.3874286075603758, 0.4228250301983162, -0.06756199848523661, 0.8778665954489996  …  1.008174872315394, -0.3929995471881523, 0.7032768374006367, 0.38025931107838473, -1.7270008907829422, 0.22935366733471255, 0.17882862687997875, 0.44289178845636096, -1.3425301258045377, 0.4706486988581435], Float64[], Int32[])

julia> param = LIBSVM.SVMParameter(
               _svmtype, _kernel, Int32(3), Float64(1.0 / size(X, 1)),
               0.0, 200.0, 0.001, 1.0, Int32(length(weights)),
               pointer(weight_labels), pointer(weights), 0.5, 0.1, Int32(true),
               Int32(false))
LIBSVM.SVMParameter(0, 2, 3, 0.25, 0.0, 200.0, 0.001, 1.0, 0, Ptr{Int32} @0x000001bf0562a440, Ptr{Float64} @0x000001bf0562a480, 0.5, 0.1, 1, 0)

julia> ninstances = size(X, 2)
100

julia> nodes, nodeptrs =  LIBSVM.instances2nodes(X)
(LIBSVM.SVMNode[LIBSVM.SVMNode(1, 1.1571690985215213) LIBSVM.SVMNode(1, 1.045503287624356) … LIBSVM.SVMNode(1, 0.015344693271484194) LIBSVM.SVMNode(1, 1.164254578419492); LIBSVM.SVMNode(2, -1.4655808933016536) LIBSVM.SVMNode(2, 1.522046538174527) … LIBSVM.SVMNode(2, -1.2274585735423154) LIBSVM.SVMNode(2, -0.4848194088689624); … ; LIBSVM.SVMNode(4, 0.23670932033058822) LIBSVM.SVMNode(4, -0.9324406722062056) … LIBSVM.SVMNode(4, 0.1819779353324507) LIBSVM.SVMNode(4, -1.1937463750802504); LIBSVM.SVMNode(-1, NaN) LIBSVM.SVMNode(-1, NaN) … LIBSVM.SVMNode(-1, NaN) LIBSVM.SVMNode(-1, NaN)], Ptr{LIBSVM.SVMNode}[Ptr{LIBSVM.SVMNode} @0x000001befa9c9080, Ptr{LIBSVM.SVMNode} @0x000001befa9c90d0, Ptr{LIBSVM.SVMNode} @0x000001befa9c9120, Ptr{LIBSVM.SVMNode} @0x000001befa9c9170, Ptr{LIBSVM.SVMNode} @0x000001befa9c91c0, Ptr{LIBSVM.SVMNode} @0x000001befa9c9210, Ptr{LIBSVM.SVMNode} @0x000001befa9c9260, Ptr{LIBSVM.SVMNode} @0x000001befa9c92b0, Ptr{LIBSVM.SVMNode} @0x000001befa9c9300, Ptr{LIBSVM.SVMNode} @0x000001befa9c9350  …  Ptr{LIBSVM.SVMNode} @0x000001befa9caca0, Ptr{LIBSVM.SVMNode} @0x000001befa9cacf0, Ptr{LIBSVM.SVMNode} @0x000001befa9cad40, Ptr{LIBSVM.SVMNode} @0x000001befa9cad90, Ptr{LIBSVM.SVMNode} @0x000001befa9cade0, Ptr{LIBSVM.SVMNode} @0x000001befa9cae30, Ptr{LIBSVM.SVMNode} @0x000001befa9cae80, Ptr{LIBSVM.SVMNode} @0x000001befa9caed0, Ptr{LIBSVM.SVMNode} @0x000001befa9caf20, Ptr{LIBSVM.SVMNode} @0x000001befa9caf70])

julia> problem = LIBSVM.SVMProblem(Int32(ninstances), pointer(idx), pointer(nodeptrs))
LIBSVM.SVMProblem(100, Ptr{Float64} @0x000001bef986c740, Ptr{Ptr{LIBSVM.SVMNode}} @0x000001bf06487140)

julia> LIBSVM.libsvm_set_verbose(true)

Then this command below:

ptr_model = LIBSVM.libsvm_train(problem, param)

kills Julia (with no printed info). What I don't understand is that LIBSVM.jl works fine with my Julia 1.8.3 (same for XGBoost.jl).

barucden commented 1 year ago

I guess you could try removing the precompile cache at ~/.julia/compiled/ (although I don't think that precompiled packages are shared between julia versions so it probably won't help).
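
As a rough sketch of how to find that cache from within Julia: the snippet below assumes the default depot layout, in which compiled caches live under `compiled/vMAJOR.MINOR` in the first entry of `DEPOT_PATH`. The removal line is left commented out because it is destructive.

```julia
# First depot is normally ~/.julia (assumption: default depot layout).
depot = first(DEPOT_PATH)

# Caches are grouped per minor version, e.g. compiled/v1.8.
cache_dir = joinpath(depot, "compiled", "v$(VERSION.major).$(VERSION.minor)")

println("Precompile cache directory: ", cache_dir)
println("Exists: ", isdir(cache_dir))

# To drop only the LIBSVM cache and force re-precompilation (destructive):
# rm(joinpath(cache_dir, "LIBSVM"); recursive = true, force = true)
```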

Apart from that, I don't know how to proceed. If nobody else responds here, I suggest you create a thread on Discourse.

mlesnoff commented 1 year ago

Removing the cache did not work. Actually, I had already created a thread on Discourse, but received no answer. This is why I created this issue. Maybe I will create one in XGBoost.jl as well. Thanks for your help above.

mlesnoff commented 1 year ago

Under Windows 11, neither XGBoost.jl nor LIBSVM.jl works from Julia 1.8.4 onwards, even with v1.9.0-beta3. Actually, to use these packages under Windows, I don't see any solution other than staying on Julia 1.8.3.

barucden commented 1 year ago

I see in the Discourse thread that the problem was already bisected, and it is mentioned in the XGBoost.jl issue that the problem is being investigated on Julia's side.

I don't think we can do much on the side of LIBSVM.jl. Let's just keep the issue open here so that other users who potentially observe the same problem know it has been reported. We can close it once it is resolved in Julia.

mlesnoff commented 1 year ago

"I see in the Discourse thread that the problem was already bisected, and it is mentioned in the XGBoost.jl issue that the problem is being investigated on Julia's side."

FYI, it seems that the issue is not exactly on the Julia side, as explained by @mkitti here.

barucden commented 1 year ago

LIBSVM also seems to use OpenMP so @mkitti is probably right.

Still, as I understand it, crashing svm_train is just a symptom, and the root cause is not within LIBSVM(.jl).

aminadibi commented 1 year ago

Same issue with Julia 1.9.0.

giordano commented 1 year ago

Fixed in Julia master by https://github.com/JuliaLang/julia/pull/50135

barucden commented 1 year ago

Thanks @giordano! Can anyone with a Windows machine download a nightly version and verify that the issue is gone so we can close it?

mkitti commented 1 year ago

I built his branch. The tests pass.

Test Summary: | Pass  Total   Time
LibSVM        |   56     56  30.3s
     Testing LIBSVM tests passed

julia> versioninfo()
Julia Version 1.10.0-DEV.1468
Commit a523212dd8* (2023-06-11 18:17 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 48 × Intel(R) Xeon(R) Gold 5220R CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, cascadelake)
  Threads: 1 on 96 virtual cores

barucden commented 1 year ago

Great! Thank you, @mkitti, for the verification. Let's close this issue then.

For anyone affected: the fix will be part of Julia 1.10 (released in several months). The fix is also scheduled for back-porting to 1.9 so it should be present in the future version 1.9.2 (released in several weeks). In the meantime, consider using the nightly version of Julia, which already contains the fix.
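
A script that must run on several Julia versions could guard against the affected range before calling into LIBSVM. This is only a sketch: the function name `is_affected` is hypothetical, the lower bound comes from the reports in this thread (1.8.3 works, 1.8.4 crashes), and the upper bound assumes the backport lands in 1.9.2 as planned. Early 1.10 development builds that predate the fix are not detected by this check.

```julia
# Hypothetical guard: true when running on Windows with a Julia version in
# the range reported as crashing in this thread (assumed fixed in 1.9.2).
function is_affected(v::VersionNumber = VERSION; windows::Bool = Sys.iswindows())
    return windows && v"1.8.4" <= v < v"1.9.2"
end

if is_affected()
    @warn "This Julia version may crash when calling LIBSVM on Windows; " *
          "consider Julia 1.8.3 or a version containing the fix."
end
```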