Closed rssdev10 closed 3 years ago
I'm afraid I cannot reproduce your problem:
```julia
julia> module Abc
       import XGBoost: dump_model, save, Booster
       using MLJ
       using MLJBase
       import MLJModels
       using MLJModels.XGBoost_
       function __init__()
           @info "Abc"
       end
       end
[ Info: Recompiling stale cache file /Users/anthony/.julia/compiled/v1.1/XGBoost/rSeEh.ji for XGBoost [009559a3-9522-5dbb-924b-0b6ed2b22bb9]
[ Info: Abc
Main.Abc

julia> using MLJ

julia> task = load_boston()
SupervisedTask @ 5…85

julia> model = Abc.XGBoostRegressor()
MLJModels.XGBoost_.XGBoostRegressor(num_round = 1,
booster = "gbtree",
disable_default_eval_metric = 0,
eta = 0.3,
gamma = 0.0,
max_depth = 6,
min_child_weight = 1.0,
max_delta_step = 0.0,
subsample = 1.0,
colsample_bytree = 1.0,
colsample_bylevel = 1.0,
lambda = 1.0,
alpha = 0.0,
tree_method = "auto",
sketch_eps = 0.03,
scale_pos_weight = 1.0,
updater = "grow_colmaker",
refresh_leaf = 1,
process_type = "default",
grow_policy = "depthwise",
max_leaves = 0,
max_bin = 256,
predictor = "cpu_predictor",
sample_type = "uniform",
normalize_type = "tree",
rate_drop = 0.0,
one_drop = 0,
skip_drop = 0.0,
feature_selector = "cyclic",
top_k = 0,
tweedie_variance_power = 1.5,
objective = "reg:linear",
base_score = 0.5,
eval_metric = "rmse",
seed = 0,) @ 1…89

julia> mach = machine(model, task)
Machine{XGBoostRegressor} @ 1…99

julia> evaluate!(mach)
┌ Info: Evaluating using cross-validation.
│ nfolds=6.
│ shuffle=false
│ measure=MLJ.rms
│ operation=StatsBase.predict
└ Resampling from all rows.
Cross-validating: 100%[=========================] Time: 0:00:01
6-element Array{Float64,1}:
 15.071084701486205
 16.70750413097405
 22.12771143813795
 20.89991496287021
 15.434870166858115
 11.602463981185641
```
Have you got MLJModels in your load path? You need both MLJModels and MLJ in your project. Perhaps send me the result of `]status -m`, or your Manifest.toml.
Please see the attached package: abc.tar.gz. Run `./build.jl` from it.
> I'm afraid I cannot reproduce your problem:
It might be a concurrency issue and therefore unstable. I cannot say that I see it every time, but in most cases it is present.

Julia version 1.0.3, macOS.
```
(Abc) pkg> status -m
Project Abc v0.1.0
    Status `~/projects/tmp/julia/Abc/Manifest.toml`
  [7d9fca2a] Arpack v0.3.1
  [9e28174c] BinDeps v0.8.10
  [b99e7846] BinaryProvider v0.5.4
  [336ed68f] CSV v0.5.5
  [324d7699] CategoricalArrays v0.5.4
  [34da2185] Compat v2.1.0
  [a93c6f00] DataFrames v0.18.3
  [864edb3b] DataStructures v0.15.0
  [b4f34e82] Distances v0.8.0
  [31c24e10] Distributions v0.20.0
  [cd3eb016] HTTP v0.8.2
  [83e8ac13] IniFile v0.5.0
  [82899510] IteratorInterfaceExtensions v1.0.0
  [682c06a0] JSON v0.20.0
  [2d691ee1] LIBLINEAR v0.5.1
  [b1bec4e5] LIBSVM v0.3.1
  [add582a8] MLJ v0.2.3
  [a7f614a8] MLJBase v0.2.2
  [d491faf4] MLJModels v0.2.3
  [739be429] MbedTLS v0.6.8
  [e1d29d7a] Missings v0.4.1
  [bac558e1] OrderedCollections v1.1.0
  [90014a1f] PDMats v0.9.7
  [69de0a69] Parsers v0.3.5
  [2dfb63ee] PooledArrays v0.5.2
  [92933f4c] ProgressMeter v1.0.0
  [1fd47b50] QuadGK v2.0.4
  [3cdcf5f2] RecipesBase v0.6.0
  [189a3867] Reexport v0.2.0
  [cbe49d4c] RemoteFiles v0.2.1
  [ae029012] Requires v0.5.2
  [79098fc4] Rmath v0.5.0
  [6e75b9c4] ScikitLearnBase v0.4.1
  [a2af1166] SortingAlgorithms v0.3.1
  [276daf66] SpecialFunctions v0.7.2
  [2913bbd2] StatsBase v0.30.0
  [4c63d2b9] StatsFuns v0.8.0
  [3783bdb8] TableTraits v1.0.0
  [bd369af6] Tables v0.2.5
  [30578b45] URIParser v0.4.0
  [ea10d353] WeakRefStrings v0.6.1
  [009559a3] XGBoost v0.3.1
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [8bb1440f] DelimitedFiles
  [8ba89e20] Distributed
  [9fa8497b] Future
  [b77e0a4c] InteractiveUtils
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [a63ad114] Mmap
  [44cfe95a] Pkg
  [de0858da] Printf
  [9abbd945] Profile
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA
  [9e88b42a] Serialization
  [1a1011a3] SharedArrays
  [6462fe0b] Sockets
  [2f01184e] SparseArrays
  [10745b16] Statistics
  [4607b0f0] SuiteSparse
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
```
Strange. I still can't reproduce your problem after activating the environment you sent:
```julia
(working) pkg> activate .

(Abc) pkg> instantiate
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`

julia> module Abc
       import XGBoost: dump_model, save, Booster
       using MLJ
       using MLJBase
       import MLJModels
       using MLJModels.XGBoost_
       function __init__()
           @info "Abc"
       end
       end
Main.Abc

julia> using MLJ

julia> task = load_boston()
SupervisedTask{} @ 1…38

julia> model = Abc.XGBoostRegressor()
MLJModels.XGBoost_.XGBoostRegressor(num_round = 1,
booster = "gbtree",
disable_default_eval_metric = 0,
eta = 0.3,
gamma = 0.0,
max_depth = 6,
min_child_weight = 1.0,
max_delta_step = 0.0,
subsample = 1.0,
colsample_bytree = 1.0,
colsample_bylevel = 1.0,
lambda = 1.0,
alpha = 0.0,
tree_method = "auto",
sketch_eps = 0.03,
scale_pos_weight = 1.0,
updater = "grow_colmaker",
refresh_leaf = 1,
process_type = "default",
grow_policy = "depthwise",
max_leaves = 0,
max_bin = 256,
predictor = "cpu_predictor",
sample_type = "uniform",
normalize_type = "tree",
rate_drop = 0.0,
one_drop = 0,
skip_drop = 0.0,
feature_selector = "cyclic",
top_k = 0,
tweedie_variance_power = 1.5,
objective = "reg:linear",
base_score = 0.5,
eval_metric = "rmse",
seed = 0,) @ 5…98

julia> mach = machine(model, task)
Machine{XGBoostRegressor} @ 1…64

julia> evaluate!(mach)
┌ Info: Evaluating using cross-validation.
│ nfolds=6.
│ shuffle=false
│ measure=MLJ.rms
│ operation=StatsBase.predict
└ Resampling from all rows.
Cross-validating: 100%[=========================] Time: 0:00:02
6-element Array{Float64,1}:
 15.071084701486205
 16.70750413097405
 22.12771143813795
 20.89991496287021
 15.434870166858115
 11.602463981185641

julia> versioninfo()
Julia Version 1.0.3
Commit 099e826241 (2018-12-18 01:34 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_PATH = /Applications/Julia-1.1.app/Contents/Resources/julia/bin/julia
```
Run on macOS.
Can you try to run it without the REPL, from the command line, with `./build.jl` only? Again, I think something like a concurrency issue is at play here.
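For anyone following along: running the script non-interactively might look like this (a sketch; the `--project` flag assumes the package environment lives in the current directory):

```shell
# run build.jl outside the REPL, using the local project environment
julia --project=. ./build.jl
```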
Also, I have a slightly older laptop:
```julia
julia> versioninfo()
Julia Version 1.0.3
Commit 099e826241 (2018-12-18 01:34 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, broadwell)
```
Yes, now I can reproduce your issue. Many thanks for this. I would say we have uncovered a limitation of Requires.jl. Do you not agree?
A secondary question is whether the `@load` macro will work when called within a package, for models in packages with native MLJ interface implementations (i.e., outside of MLJModels). In this case there would be no lazy loading. Unfortunately, no such package exists yet, but we will have some soon (or could construct a dummy package).
Edit (July 23, 2020): I can confirm that if the interface is provided by a package that does not use Requires.jl, the issue is not present.
Yes, it might be a restriction of Requires.jl. See also the double call of `__init__` that I mentioned in my first message. But again, I am almost sure that it is a concurrency issue; I found it while preparing the code to run as a web service.

So we have some workaround. Regarding how to fix it: now that the issue is confirmed, maybe just file the same issue, with my sample, against Requires.jl, if nobody can dive into it now.
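For context, the lazy-loading pattern under suspicion looks roughly like this (a minimal sketch of Requires.jl's `@require` mechanism; the module and file names are hypothetical, not MLJModels' actual layout):

```julia
module MyGlue  # hypothetical glue-code package

using Requires

function __init__()
    # The glue code is only included after the user imports XGBoost;
    # the UUID identifies XGBoost.jl. Because this runs from inside
    # __init__, anything that initializes the module twice, or from
    # two tasks at once, interacts directly with this mechanism.
    @require XGBoost="009559a3-9522-5dbb-924b-0b6ed2b22bb9" include("xgboost_glue.jl")
end

end
```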
Regarding the loading of models, for now I'm using `Booster(model_file = model_fn)` directly for XGBoost.
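A minimal sketch of that workaround, assuming the XGBoost.jl 0.x API (`xgboost`, `save`, `Booster`); the data and file name are illustrative:

```julia
using XGBoost

# train a small booster on dummy data
X = rand(100, 4)
y = rand(100)
bst = xgboost(X, 2, label = y, objective = "reg:linear")

# persist the booster, then reload it directly,
# bypassing MLJ's lazily loaded interface entirely
save(bst, "model.bin")
bst2 = Booster(model_file = "model.bin")
```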
Although I am doubtful, I thought it worth mentioning that there was a refactor of `@load` that possibly resolves this issue. MLJModels 0.4.0 (which now owns the method) incorporates the changes.
Update: This issue is unresolved under MLJModels 0.5.0.
@ablaom is this still a (relevant) issue?
I believe it is still an issue. It seems one can't use MLJ to load models from within a package module. Some clues are provided above and in #321. I suspect (but have not confirmed) that this is a Requires.jl issue. To reproduce, be sure to follow the instructions of @rssdev10 exactly.
@ablaom @tlienart We're running into this issue as well
Noted. The long-term plan is to "disintegrate" MLJModels into individual packages, eliminating all use of Requires.jl. Then loading a model whose glue code is currently provided by MLJModels should be no different from loading models from packages that natively support the MLJ model interface (e.g., EvoTrees.jl, MLJLinearModels.jl). In those cases I am not aware of any issue, but let me know if you discover one.
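For comparison, with a package that natively implements the MLJ interface the load path involves no Requires.jl indirection. A sketch, assuming EvoTrees.jl is installed (in recent MLJ versions `@load` returns the model type; the hyperparameter shown is illustrative):

```julia
using MLJ

# the interface code ships inside EvoTrees itself, so no lazy loading
Tree = @load EvoTreeRegressor pkg=EvoTrees
model = Tree(max_depth = 5)
```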
Partial workaround is here: https://github.com/alan-turing-institute/MLJ.jl/issues/613#issuecomment-662784184
I think we had better start the disintegration of MLJModels quickly.
PRs welcome 😄 Happy to provide guidance. The repos are called `MLJGLMInterface.jl`, and so forth. If you want to start on one, let me know which, and I'll get you commit access.
Here's the issue: https://github.com/alan-turing-institute/MLJModels.jl/issues/244#issuecomment-641668554
Great. I will work on them in my spare time.
Pretty sure this has been resolved by the above PR.
I'm trying to make a module with MLJModels:

but I am getting an error:

```
ERROR: LoadError: UndefVarError: XGBoost_ not defined
```

Looks like there is an issue with lazy activation in

One workaround I found is

Also, I added debug output to the `__init__` function of the module MLJModels, and I see that this method is called twice. I have something like:

Maybe it is related to a chain of `__init__` methods.
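To illustrate the `__init__` semantics being discussed: Julia runs a module's `__init__` once, immediately after the module is loaded, so seeing it fire twice suggests the module is being initialized through two paths. A minimal sketch with a counter (the module name is hypothetical):

```julia
module InitCounter

# count how many times __init__ runs for this module
const calls = Ref(0)

function __init__()
    calls[] += 1
    @info "InitCounter.__init__ call #$(calls[])"
end

end

# under normal loading, __init__ has run exactly once by this point
@assert InitCounter.calls[] == 1
```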