JuliaAI / MLJ.jl

A Julia machine learning framework
https://juliaai.github.io/MLJ.jl/

Error with `RecursiveFeatureElimination` + `EvoTreeClassifier` #1145

Open LucasMatSP opened 2 weeks ago

LucasMatSP commented 2 weeks ago

Problem

When using RecursiveFeatureElimination with an EvoTreeClassifier model, I get the following error during fitting:

┌ Error: Problem fitting the machine machine(ProbabilisticRecursiveFeatureElimination(model = EvoTrees.EvoTreeClassifier{EvoTrees.MLogLoss}
│  - nrounds: 100
│  - L2: 0.0
│  - lambda: 0.0
│  - gamma: 0.0
│  - eta: 0.1
│  - max_depth: 6
│  - min_weight: 1.0
│  - rowsample: 1.0
│  - colsample: 1.0
│  - nbins: 64
│  - alpha: 0.5
│  - tree_type: binary
│  - rng: Random.MersenneTwister(123, (0, 6012, 5010, 352))
│ , …), …).
└ @ MLJBase C:\Users\user\.julia\packages\MLJBase\7nGJF\src\machines.jl:694
[ Info: Running type checks... 
[ Info: Type checks okay.
ERROR: LoadError: MethodError: Cannot `convert` an object of type String to an object of type Symbol
The function `convert` exists, but no method is defined for this combination of argument types.   

Closest candidates are:
  Symbol(::String)
   @ Core boot.jl:618
  Symbol(::AbstractString)
   @ Base strings\basic.jl:228
  Symbol(::Any...)
   @ Base strings\basic.jl:229
  ...

Stacktrace:
  [1] setindex!(A::Vector{Symbol}, x::String, i::Int64)
    @ Base .\array.jl:976
  [2] score_features!(scores_dict::Dict{…}, features::Vector{…}, importances::Vector{…}, n_features_to_score::Int64)
    @ FeatureSelection C:\Users\user\.julia\packages\FeatureSelection\uPgNd\src\models\rfe.jl:261
  [3] fit(::FeatureSelection.ProbabilisticRecursiveFeatureElimination{…}, ::Int64, ::DataFrame, ::CategoricalArrays.CategoricalVector{…})
    @ FeatureSelection C:\Users\user\.julia\packages\FeatureSelection\uPgNd\src\models\rfe.jl:328
  [4] fit_only!(mach::Machine{…}; rows::Nothing, verbosity::Int64, force::Bool, composite::Nothing)
    @ MLJBase C:\Users\user\.julia\packages\MLJBase\7nGJF\src\machines.jl:692
  [5] fit_only!
    @ C:\Users\user\.julia\packages\MLJBase\7nGJF\src\machines.jl:617 [inlined]
  [6] #fit!#63
    @ C:\Users\user\.julia\packages\MLJBase\7nGJF\src\machines.jl:789 [inlined]
  [7] fit!(mach::Machine{…})
    @ MLJBase C:\Users\user\.julia\packages\MLJBase\7nGJF\src\machines.jl:786
  [8] top-level scope
    @ C:\Users\user\src\case 1\mwe.jl:23
  [9] include(fname::String)
    @ Main .\sysimg.jl:38
 [10] top-level scope
    @ REPL[8]:1
in expression starting at C:\Users\user\src\case 1\mwe.jl:23
Some type information was truncated. Use `show(err)` to see complete types.
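
Frame [1] of the stack trace shows a String being written into a `Vector{Symbol}` inside `score_features!`. The trace alone does not show where the String feature names come from, so the snippet below is only a minimal sketch of that one failing operation in plain Julia. It does not use MLJ, FeatureSelection, or EvoTrees, and the variable names `features` and `err` are made up for illustration.

```julia
# Standalone illustration of the failure in frame [1]; no packages required.
# `features` is a hypothetical stand-in for the Vector{Symbol} seen in the trace.
features = Vector{Symbol}(undef, 2)

# Assigning a String to an element of a Vector{Symbol} makes setindex! call
# convert(Symbol, ::String), which has no method, so a MethodError is thrown.
err = try
    features[1] = "x1"
    nothing
catch e
    e
end
println(err isa MethodError)

# Calling the Symbol constructor explicitly works, as the
# "Closest candidates" list in the error message hints.
features[1] = Symbol("x1")
features[2] = Symbol("x2")
println(features)
```

If the feature names passed to `score_features!` are indeed Strings for this model, an explicit `Symbol(...)` conversion at that point would avoid the error, but that is an assumption about the package internals rather than something the trace confirms.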

Reproduce

using DataFrames, MLJ, ScientificTypesBase
# Load models
EvoTreeClassifier = @load EvoTreeClassifier pkg = EvoTrees
RFclassifier = @load RandomForestClassifier pkg = DecisionTree
# Data
df = DataFrame(rand(1:10, (62, 47)), :auto)
# Set inputs and outputs
inputs = df[:, 16:end-1]
outputs = df[:, 2]
# Coerce
inputNames = names(inputs)
continuousData = ScientificTypesBase.Continuous
inputs = coerce(
  inputs,
  Dict(
    Symbol(col) => continuousData for col in inputNames
  )
)
outputs = coerce(Int.(outputs), Binary)
# Feature selection: gradient boost
rfe_gboost = RecursiveFeatureElimination(EvoTreeClassifier())
rfeGBmach = machine(rfe_gboost, inputs, outputs)
fit!(rfeGBmach)

Versions

DataFrames 1.7.0
MLJ 0.20.7
ScientificTypesBase 3.0.0

Julia 1.11.0
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 12 × 13th Gen Intel(R) Core(TM) i7-1365U
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, goldmont)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)

Note: RecursiveFeatureElimination with RandomForestClassifier works fine, and EvoTreeClassifier on its own works as well.

OkonSamuel commented 1 week ago

Thanks @LucasMatSP for reporting this issue. I'll look into it.