JuliaAI / MLJ.jl

A Julia machine learning framework
https://juliaai.github.io/MLJ.jl/

Unexpected loading behavior #757

Open azev77 opened 3 years ago

azev77 commented 3 years ago

I installed a fresh copy of Julia 1.6 on my MacBook, installed the latest MLJ (v0.16.0), and then tried to auto-load all the MLJ-relevant packages:

using MLJ  # provides `models()` and the `@load` macro

# Load (and, if necessary, install) the interface package for every registered model.
@inline function load_m(model_list)
    model_types = []
    @inbounds for model in model_list # models()
        T = @eval @load $(model.name) pkg=$(model.package_name) add=true # verbosity=0
        push!(model_types, T)
    end
    model_instances = map(T -> T(), model_types)  # instantiate each loaded model type
end
load_m(models()) # Load all models ONCE!

Now my MLJ version is 0.15.2.

Why am I getting downgraded?

azev77 commented 3 years ago

I think I found the culprit:


julia> Pkg.add("NaiveBayes")
   Resolving package versions...
   Installed GPUCompiler ─ v0.9.2
   Installed CUDA ──────── v2.6.0
    Updating `~/.julia/environments/v1.6/Project.toml`
  [f6006082] ↓ EvoTrees v0.7.0 ⇒ v0.5.3
  [add582a8] ↓ MLJ v0.16.0 ⇒ v0.15.2
  [d491faf4] ↓ MLJModels v0.14.1 ⇒ v0.13.3
  [9bbee03b] + NaiveBayes v0.5.0
ablaom commented 3 years ago

Good work identifying the culprit.

I'll investigate.

ablaom commented 3 years ago

This should resolve the issue:

https://github.com/dfdx/NaiveBayes.jl/pull/51

ablaom commented 3 years ago

I can locally get all three:

(jl_QLSWNr) pkg> st
Status `/private/var/folders/4n/gvbmlhdc8xj973001s6vdyw00000gq/T/jl_QLSWNr/Project.toml`
  [f6006082] EvoTrees v0.7.0
  [add582a8] MLJ v0.16.0
  [9bbee03b] NaiveBayes v0.5.0 `../../../../../../../Users/anthony/Dropbox/Julia7/NaiveBayes`
azev77 commented 3 years ago

But do you know why installing a new package downgraded MLJ automatically? And can MLJ prevent this from happening without the user's consent?

ablaom commented 3 years ago

Isn't this essentially a package manager issue, not an MLJ issue?

If you want to fix the MLJ version, then do ]pin MLJ or Pkg.pin("MLJ"). The package manager will then throw a conflict error if adding a new package would push MLJ down.
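
For example, a minimal sketch of that workflow, using the same packages as in this thread:

import Pkg

Pkg.pin("MLJ")         # lock MLJ at its currently installed version

# Adding a package that would otherwise force MLJ down now raises an
# unsatisfiable-requirements error instead of silently downgrading:
Pkg.add("NaiveBayes")

Pkg.free("MLJ")        # remove the pin when you no longer need it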

ablaom commented 3 years ago

(pin is also an indispensable tool for debugging package conflicts.)

azev77 commented 3 years ago

You're probably correct; I wanted to make sure there is no MLJ setting behind this.

These kinds of things can create a lot of issues, especially for newcomers. Do you think it's worth adding this (auto-loading all models) to the tests?

ablaom commented 3 years ago

We essentially have a test like this: MLJModels.check_registry(). See https://github.com/alan-turing-institute/MLJModels.jl/blob/dev/src/registry/src/check_registry.jl .

However, we only run it after registry updates, not in CI.

ablaom commented 3 years ago

These kinds of things can create a lot of issues, especially for newcomers.

Agreed! Pkg management has been a huge stumbling block for me.

Maybe there is a good tutorial introducing package management to newcomers, I don't know. But I sure wish there had been one when I started out. 😢

instantiate -p solves a lot of early pitfalls

I think the Julia developers are aware of the importance of making package management more user friendly, and the status quo is already better than the conda/pip/virtualenv mess. But I don't think it's easy to have a package management system without also having some kind of user training. I'm no expert, but I understand the real source of Dependency Hell is the fact that the package dependency problem is NP-hard, which surprised me when I first encountered it. That's part of the problem: we intuitively feel package management ought to be something that can be fully automated, but that's hard to do.

azev77 commented 3 years ago

Would it be a big hassle to run it in CI?

ablaom commented 3 years ago

Well, this requires pre-compiling and loading the entire stack of model-providing packages, which takes a while, as you know.

I suppose we could add it to the dev -> master PRs only (not as part of runtests). Would you see that as sufficient?

That is, for such PRs to run

using MLJModels
MLJModels.check_registry()

and fail the build if an exception is thrown.
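
(Concretely, the CI step could be something like the sketch below; this is a hypothetical script, assuming, as above, that failures surface as exceptions.)

# hypothetical CI script, not the actual MLJModels configuration
using MLJModels

try
    MLJModels.check_registry()   # loads every registered model interface
catch e
    @error "Registry check failed" exception = (e, catch_backtrace())
    exit(1)                      # non-zero exit fails the CI job
end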

@DilumAluthge Is this something you could help with? On MLJModels, not MLJ.

azev77 commented 3 years ago

That would be great!

DilumAluthge commented 3 years ago

Yeah shouldn't be too hard.

DilumAluthge commented 3 years ago

I'll work on it here: https://github.com/alan-turing-institute/MLJModels.jl/pull/369

azev77 commented 3 years ago

Just tried loading all the models on my work computer. I still have the same problem with NaiveBayes.

ablaom commented 3 years ago

It seems we are still waiting for a patch release from NaiveBayes: https://github.com/dfdx/NaiveBayes.jl/pull/51#issuecomment-809846249

BTW @DilumAluthge has now added check_registry to the MLJModels CI and it seems to be working.

azev77 commented 3 years ago

It works now, but there are some instabilities. The first time I load all packages I get an error:

@inline function load_m(model_list)
    model_types = []
    @inbounds for model in model_list #models()
        if (
            1 == 1
            #model.package_name != "MLJNaiveBayesInterface" 
            #&& 
            #model.package_name !=  "NaiveBayes"
            )
            println(model.name, " ", model.package_name)
            try
                T = @eval @load $(model.name) pkg=$(model.package_name) add = true #add=true # verbosity=0
                push!(model_types, T)    
            catch e
                return e
            end 
        end     
    end
    model_instances = map(T -> T(), model_types)
end
models() |> load_m # Load all models ONCE!

#############
...

XGBoostClassifier XGBoost
[ Info: For silent loading, specify `verbosity=0`.
import MLJXGBoostInterface ✔
XGBoostCount XGBoost
[ Info: For silent loading, specify `verbosity=0`.
import MLJXGBoostInterface ✔
XGBoostRegressor XGBoost
[ Info: For silent loading, specify `verbosity=0`.
import MLJXGBoostInterface ✔
ERROR: MethodError: no method matching MLJScikitLearnInterface.ARDRegressor()
The applicable method may be too new: running in world age 29720, while current world is 30272.
Closest candidates are:
  MLJScikitLearnInterface.ARDRegressor(; n_iter, tol, alpha_1, alpha_2, lambda_1, lambda_2, compute_score, threshold_lambda, fit_intercept, normalize, copy_X, verbose) at none:0 (method too new to be called from this world context.)
Stacktrace:
 [1] (::var"#5#6")(T::Type{MLJScikitLearnInterface.ARDRegressor})
   @ Main .\REPL[2]:19
 [2] iterate
   @ .\generator.jl:47 [inlined]
 [3] _collect(c::Vector{Any}, itr::Base.Generator{Vector{Any}, var"#5#6"}, #unused#::Base.EltypeUnknown, isz::Base.HasShape{1})
   @ Base .\array.jl:691
 [4] collect_similar
   @ .\array.jl:606 [inlined]
 [5] map
   @ .\abstractarray.jl:2294 [inlined]
 [6] load_m
   @ .\REPL[2]:19 [inlined]
 [7] |>(x::Vector{NamedTuple{(:name, :package_name, :is_supervised, :docstring, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :is_pure_julia, :is_wrapper, :load_path, :package_license, :package_url, :package_uuid, :prediction_type, :supports_online, :supports_weights, :input_scitype, :target_scitype, :output_scitype), T} where T<:Tuple}, f::typeof(load_m))
   @ Base .\operators.jl:858
 [8] top-level scope
   @ REPL[3]:1

julia> 

But it works when I run it a second time.
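
(That first-run failure matches what the world-age error above says: the model types are created by @eval inside the same function call that then tries to construct them, so the constructors are "too new" for that call. A minimal sketch of one possible workaround, constructing the types via Base.invokelatest; the name load_m_latest is just illustrative:)

using MLJ

function load_m_latest(model_list)
    model_types = []
    for model in model_list
        # @eval defines the model type in a newer world than the running function
        T = @eval @load $(model.name) pkg=$(model.package_name) add=true verbosity=0
        push!(model_types, T)
    end
    # invokelatest constructs each type in the latest world, sidestepping the
    # "method too new to be called from this world context" error
    map(T -> Base.invokelatest(T), model_types)
end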

ablaom commented 3 years ago

I can also get this behaviour, but in addition I get occasional complaints (the first time around) like this one:

EvoTreeClassifier EvoTrees
[ Info: For silent loading, specify `verbosity=0`. 
[ Info: Precompiling EvoTrees [f6006082-12f8-11e9-0c9c-0d5d367ab1e5]
ERROR: LoadError: ArgumentError: Package CUDA [052768ef-5323-5732-b1bb-66c8b64840ba] is required but does not seem to be installed:
 - Run `Pkg.instantiate()` to install all recorded dependencies.

Stacktrace:  [1] _require(pkg::Base.PkgId)
   @ Base ./loading.jl:990
 [2] require(uuidkey::Base.PkgId)
   @ Base ./loading.jl:914
 [3] require(into::Module, mod::Symbol)
   @ Base ./loading.jl:901
 [4] include
   @ ./Base.jl:386 [inlined]
 [5] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, 
dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::Nothing)
   @ Base ./loading.jl:1213
 [6] top-level scope
   @ none:1
 [7] eval
   @ ./boot.jl:360 [inlined]
 [8] eval(x::Expr)
   @ Base.MainInclude ./client.jl:446
 [9] top-level scope
   @ none:1
in expression starting at /Users/anthony/Dropbox/Julia7/EvoTrees/src/EvoTrees.jl:1

which I don't understand, as CUDA is in the Project.toml file of the EvoTrees.jl package. However, it seems CUDA (and the EvoTrees.jl models) get installed nevertheless.

Once the complete environment is built (which happens by the end of the first call) there do not appear to be any problems. (So your issue has a workaround: just start with a pre-populated environment.) But there is something funny going on when building the environment as you go.

@giordano Do you have any ideas?

azev77 commented 3 years ago

Just tried again. I got the same error message the first time, but not the second. Same as before...

ablaom commented 3 years ago

Appreciate that. Unfortunately, I don't have a lot of bandwidth to spare for this issue just now. We may need some help, but that will be easier if we can isolate this a little better. For example, can you reproduce the error by dropping the loop and just using model = models()[1]?
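
(A minimal reproduction along those lines might look something like this, entered one statement at a time at the REPL; models()[1] is just whichever model happens to come first.)

using MLJ

model = models()[1]    # a single entry from the model registry
T = @eval @load $(model.name) pkg=$(model.package_name) add=true verbosity=0
T()                    # instantiate it straight away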

giordano commented 3 years ago

I tried to reproduce the issue in an environment with only MLJ.jl installed with this code:

using MLJ
@inline function load_m(model_list)
    model_types = []
    @inbounds for model in model_list #models()
        if (
            1 == 1
            #model.package_name != "MLJNaiveBayesInterface" 
            #&& 
            #model.package_name !=  "NaiveBayes"
            )
            println(model.name, " ", model.package_name)
            try
                T = @eval @load $(model.name) pkg=$(model.package_name) add = true #add=true # verbosity=0
                push!(model_types, T)    
            catch e
                return e
            end 
        end     
    end
    model_instances = map(T -> T(), model_types)
end
models() |> load_m # Load all models ONCE!

But I don't get your error on EvoTrees.jl. I believe the overall issue is that you're trying to smash together some hundreds of packages in the same environment, and you're bound to run into version incompatibilities. I'm not sure MLJ.jl is related to this, though.

azev77 commented 3 years ago

The following compromise seems to work for my blog post: I can add the latest versions of all packages except Conda, PyCall, DataFrames, and MLJBase; for those four I add the immediately previous version.

import Pkg;
Pkg.activate(mktempdir())
Pkg.add([
    Pkg.PackageSpec(name="Conda", version="1.5.1"), #1.5.2
    Pkg.PackageSpec(name="PyCall", version="1.92.2"), #1.92.3
    Pkg.PackageSpec(name="MLJ", version="0.16.2"),
    Pkg.PackageSpec(name="MLJScikitLearnInterface", version="0.1.9"), #0.1.9
    Pkg.PackageSpec(name="RDatasets", version="0.7.5"),
    Pkg.PackageSpec(name="DataFrames", version="0.22.7"), #1.0.1  #0.22.7
    Pkg.PackageSpec(name="MLJDecisionTreeInterface", version="0.1.2"),
    Pkg.PackageSpec(name="MLJMultivariateStatsInterface", version="0.2.2"),
    Pkg.PackageSpec(name="MLJModels", version="0.14.4"),
    Pkg.PackageSpec(name="BetaML", version="0.5.0"),
    Pkg.PackageSpec(name="MLJBase", version="0.18.1"), #0.18.2
    Pkg.PackageSpec(name="MLJLinearModels", version="0.5.4"), 
    Pkg.PackageSpec(name="MLJLIBSVMInterface", version="0.1.3"), 
    Pkg.PackageSpec(name="EvoTrees", version="0.7.5"),
    Pkg.PackageSpec(name="MLJNaiveBayesInterface", version="0.1.3"),
    Pkg.PackageSpec(name="MLJXGBoostInterface", version="0.1.3"),
    Pkg.PackageSpec(name="NearestNeighborModels", version="0.1.5"), 
    Pkg.PackageSpec(name="ParallelKMeans", version="0.2.2"), 
    Pkg.PackageSpec(name="PartialLeastSquaresRegressor", version="2.2.0"), 
    Pkg.PackageSpec(name="MLJFlux", version="0.1.10"), 
    Pkg.PackageSpec(name="MLJClusteringInterface", version="0.1.4"),
    Pkg.PackageSpec(name="LightGBM", version="0.5.2"),
    Pkg.PackageSpec(name="MLJGLMInterface", version="0.1.4"),
    #
    Pkg.PackageSpec(name="Plots", version="1.12.0"),
])

I still get a strange error when I load all the models the first time (it works the second time):

# load_m(): install/load all wrapped models
@inline function load_m(model_list)
    model_types = []
    @inbounds for model in model_list
        println(model.name, " ", model.package_name)
        try
            T = @eval @load $(model.name) pkg=$(model.package_name) add=true # verbosity=0
            push!(model_types, T)
        catch e
            return e
        end
    end
    model_instances = map(T -> T(), model_types)
end
models() |> load_m # Load all models

...
XGBoostRegressor XGBoost
[ Info: For silent loading, specify `verbosity=0`.
import MLJXGBoostInterface ✔
ERROR: MethodError: no method matching MLJScikitLearnInterface.ARDRegressor()
The applicable method may be too new: running in world age 29729, while current world is 30240.
Closest candidates are:
  MLJScikitLearnInterface.ARDRegressor(; n_iter, tol, alpha_1, alpha_2, lambda_1, lambda_2, compute_score, threshold_lambda, fit_intercept, normalize, copy_X, verbose) at none:0 (method too new to be called from this world context.)
Stacktrace:
 [1] (::var"#5#6")(T::Type{MLJScikitLearnInterface.ARDRegressor})
   @ Main .\REPL[9]:24
 [2] iterate
   @ .\generator.jl:47 [inlined]
 [3] _collect(c::Vector{Any}, itr::Base.Generator{Vector{Any}, var"#5#6"}, #unused#::Base.EltypeUnknown, isz::Base.HasShape{1})
   @ Base .\array.jl:691
 [4] collect_similar
   @ .\array.jl:606 [inlined]
 [5] map
   @ .\abstractarray.jl:2294 [inlined]
 [6] load_m
   @ .\REPL[9]:24 [inlined]
 [7] |>(x::Vector{NamedTuple{(:name, :package_name, :is_supervised, :docstring, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :prediction_type, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :input_scitype, :target_scitype, :output_scitype), T} where T<:Tuple}, f::typeof(load_m))
   @ Base .\operators.jl:858
 [8] top-level scope
   @ REPL[10]:1
ablaom commented 3 years ago

MLJ has MLJBase and MLJModels as dependencies, so there shouldn't be a need to explicitly add MLJBase and MLJModels to the environment, right? Any methods of MLJBase and MLJModels that are of use to general users are re-exported by MLJ (if you find one that is not, post an issue).
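
(For example, in a fresh environment containing only MLJ, something like the following should already work without adding MLJBase or MLJModels explicitly; a small sketch:)

import Pkg
Pkg.activate(mktempdir())
Pkg.add("MLJ")    # only MLJ is added explicitly

using MLJ
models()          # registry queries come from MLJModels, re-exported by MLJ
measures()        # performance-measure listings come from MLJBase, re-exported by MLJ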

ablaom commented 3 years ago

Curious why you need to explicitly import PyCall and Conda (although I understand they are likely to end up in the Manifest anyway).

azev77 commented 3 years ago

I need to add the older versions of Conda & PyCall for sklearn to work.

ablaom commented 3 years ago

The difference between your versions and the latest is a single patch release, and the patches don't appear to make changes to the Project.toml files. So this is a little surprising. Mmm.