sylvaticus / BetaML.jl

Beta Machine Learning Toolkit
MIT License

MLJ interface for `KernelPerceptronClassifier` is not tracking all target levels #31

Closed: ablaom closed this issue 2 years ago

ablaom commented 2 years ago
julia> using MLJ

julia> Model = @load KernelPerceptronClassifier
[ Info: For silent loading, specify `verbosity=0`. 
import BetaML ✔
BetaML.Perceptron.KernelPerceptronClassifier

julia> model = Model()
KernelPerceptronClassifier(
  K = BetaML.Utils.radialKernel, 
  maxEpochs = 100, 
  initialα = Int64[], 
  shuffle = false, 
  rng = Random._GLOBAL_RNG())

julia> X = (x=rand(10), );

julia> y = coerce(collect("abababababcc"), Multiclass)[1:10];

julia> unique(y)
2-element Vector{Char}:
 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
 'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)

julia> levels(y)
3-element Vector{Char}:
 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
 'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
 'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)

julia> mach = machine(model, X, y) |> fit!;
[ Info: Training machine(KernelPerceptronClassifier(K = radialKernel, …), …).

julia> predict_mode(mach, X) |> levels
2-element Vector{Char}:
 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
 'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)

That last result indicates a bug: every level in the pool of the training vector should also be present in the pool of the predictions, including levels (here `'c'`) that do not occur among the training observations.

Curiously, in the other classifiers I looked at, the levels are being tracked correctly. So perhaps have a look at, e.g., the BetaML `DecisionTreeClassifier` to see how this can be corrected.

This bug causes a failure when the model is bagged in an ensemble using `EnsembleModel`, because some classes are absent from some of the bagged samples but present in others.
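
The difference between the observed classes and the full level pool can be reproduced without MLJ. The following is only a minimal sketch, assuming CategoricalArrays.jl is installed and using `categorical` in place of `coerce(..., Multiclass)`. The names `raw`, `bad`, and `good` are illustrative and are not taken from BetaML's source. It is not the fix that was applied in BetaML. It only shows why rebuilding a categorical vector from raw predicted labels loses unseen levels, and one way to keep them.

```julia
using CategoricalArrays

# Same target as in the report: the pool has three levels,
# but only 'a' and 'b' occur in the first ten observations.
y = categorical(collect("abababababcc"))[1:10]

# Stand-in for raw (non-categorical) labels produced by a classifier.
raw = unwrap.(y)

# Rebuilding from raw labels alone: the pool only contains the labels seen.
bad = categorical(raw)

# Passing the training levels explicitly keeps the full pool.
# `unwrap.` is applied so the keyword receives plain values.
good = categorical(raw; levels = unwrap.(levels(y)))

println(length(levels(y)))     # pool size of the training target
println(length(levels(bad)))   # pool size when levels are not tracked
println(length(levels(good)))  # pool size when training levels are passed on
```

In an MLJ model interface the analogous step would be to record the training levels during `fit` and use them when constructing predictions, so that bagged sub-samples missing a class still return predictions over the same pool.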

sylvaticus commented 2 years ago

OK, thanks, I'll have a look at this in the (EU) afternoon... cheers

sylvaticus commented 2 years ago

Solved in the newly released v0.6 (also for `PerceptronClassifier` and `PegasosClassifier`).

ablaom commented 2 years ago

Thanks @sylvaticus for this speedy fix. Much appreciated.