JuliaAI / MLJBase.jl

Core functionality for the MLJ machine learning framework
MIT License

Something not right with serialisation of python (sk-learn) models #435

Open ablaom opened 3 years ago

ablaom commented 3 years ago
using MLJ

model = @load RandomForestClassifier pkg=ScikitLearn

data = load_iris()
y, X = unpack(data, ==(:target), n->true, rng=1234)

mach = machine(model, X, y) |> fit!
MLJ.save("junk.jlso", mach)

## NEW REPL SESSION

using MLJ

model = @load RandomForestClassifier pkg=ScikitLearn

mach_predict_only = machine("junk.jlso") 

Sometimes the last expression hangs, and if I interrupt it, Julia crashes. Sometimes it works fine. The same is true for JLSO.load("junk.jlso"). Is anyone able to reproduce this?

(junk) pkg> st
Status `~/Dropbox/Julia7/MLJ/MLJ/sandbox/junk/Project.toml`
  [9da8a3cd] JLSO v2.3.2
  [add582a8] MLJ v0.14.1
  [5ae90465] MLJScikitLearnInterface v0.1.6

julia> versioninfo()
Julia Version 1.5.1
Commit 697e782ab8 (2020-08-25 20:08 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.5.0)
  CPU: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
Environment:
  JULIA_LTS_PATH = /Applications/Julia-1.0.app/Contents/Resources/julia/bin/julia
  JULIA_PATH = /Applications/Julia-1.5.app/Contents/Resources/julia/bin/julia
  JULIA_NUM_THREADS = 5

It may be that the Python objects bound to fitresult are not reliably persistent, and we need to implement custom MMI.save and MMI.restore methods along the lines of XGBoost.

Assuming ScikitLearn.jl exposes the python serialization, this ought to be easy to do.

@OkonSamuel Thoughts?

OkonSamuel commented 3 years ago

@ablaom Have you tried testing this with the previous implementation?

ablaom commented 3 years ago

Will do.

ablaom commented 3 years ago

It appears the issue was already present in the previous implementation (MLJBase@0.15.2). A single call to JLSO.load("junk.jlso") has been hanging for over 30 minutes.

OkonSamuel commented 3 years ago

I am still trying to reproduce this issue on my Linux PC.

(@v1.5) pkg> st
Status `~/.julia/environments/v1.5/Project.toml`
  [6e4b80f9] BenchmarkTools v0.5.0
  [13f3f980] CairoMakie v0.3.3
  [a93c6f00] DataFrames v0.21.8
  [da1fdf0e] FreqTables v0.4.1
  [c91e804a] Gadfly v1.3.1
  [add582a8] MLJ v0.14.1
  [a7f614a8] MLJBase v0.15.3
  [d354fa79] MLJClusteringInterface v0.1.0 `~/Documents/MLJClusteringInterface.jl`
  [e80e1ace] MLJModelInterface v0.3.6
  [d491faf4] MLJModels v0.12.4
  [5ae90465] MLJScikitLearnInterface v0.1.6
  [ee78f7c6] Makie v0.11.1
  [6f286f6a] MultivariateStats v0.7.0
  [b98c9c47] Pipe v1.3.0
  [1a8c2f83] Query v1.0.0
  [ce6b1742] RDatasets v0.6.10
  [2913bbd2] StatsBase v0.33.2
  [fa267f1f] TOML v1.0.0
  [bd369af6] Tables v1.1.0
  [276b4fcb] WGLMakie v0.2.9

julia> versioninfo()
Julia Version 1.5.2
Commit 539f3ce943* (2020-09-23 23:17 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i3-5005U CPU @ 2.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, broadwell)

EDIT: Perhaps it's a platform-dependent or Julia-version-dependent issue? That said, it takes roughly a minute to load or save machines on my machine.

ablaom commented 3 years ago

My tests were run with 5 threads. If I set the number of threads to one, the problem seems to go away.
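
For anyone trying to reproduce the single-thread comparison, a minimal sketch of the second session might look like the following. It assumes Julia was launched with JULIA_NUM_THREADS=1 (the variable is read at startup, so it cannot be changed from inside the REPL) and that "junk.jlso" was written by the first session in the original report. The @time call is an addition for illustration, not part of the original report.

```julia
# Assumes Julia was started with a single thread, for example:
#   JULIA_NUM_THREADS=1 julia
# and that "junk.jlso" was saved by the first session in the report.
using MLJ

# Confirm the thread count this session actually picked up.
@show Threads.nthreads()

model = @load RandomForestClassifier pkg=ScikitLearn

# Time the restore so a genuine hang can be told apart from a slow load.
@time mach_predict_only = machine("junk.jlso")
```

Running the same lines in a session started with JULIA_NUM_THREADS=5 should show whether the hang tracks the thread count.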