JuliaAI / MLJLinearModels.jl

Generalized Linear Regression Models (penalized regressions, robust regressions, ...)
MIT License

Multinomial regressor requires integer categories, and greater than zero #58

Closed: aquaresima closed this issue 2 years ago

aquaresima commented 4 years ago

Hi, I faced this issue when using multinomial regression (a.k.a. multiclass logistic regression). Below, I show an example with MNIST classification.

I have to shift all the categories by 1 (0->1, 1->2, ..., 9->10), otherwise I get the error reported at the end.

# Classify MNIST digits with a simple multi-layer-perceptron
using Flux.Data.MNIST
using MLJLinearModels
# Get MNIST dataset and transpose for (records, features)
imgs = MNIST.images()
X = Array(transpose(hcat(float.(reshape.(imgs, :))...)))
# MNIST labels: Categorical labels must be 1...c, hence add .+1 to each label
labels = MNIST.labels() .+ 1
# and the number of classes
n_classes = length(Set(labels))
n_features = size(X,2)
# The MNIST database does not need the intercept
intercept = false
# deploy MultinomialRegression from MLJLinearModels, λ being the strength of the regulariser
λ = 0.1   # example value for the regularisation strength
mnr = MultinomialRegression(λ; fit_intercept=intercept)
# Fit the model
θ  = fit(mnr, X, labels)
# The model parameters are organized such that we can apply X⋅θ; the following is only to clarify
params = reshape(θ, n_features + Int(intercept), n_classes)
# Get the predictions X⋅θ
preds = MLJLinearModels.softmax(MLJLinearModels.apply_X(X, θ, n_classes))
# map each row to the index of its maximal element
targets = map(argmax, eachrow(preds))
# and evaluate the accuracy of the predictions against the labels
scores = sum(targets .== labels) / length(labels)

Error with 0 among the categories:


DimensionMismatch("new dimensions (785, 10) must be consistent with array size 7056")
(::Base.var"#throw_dmrsa#197")(::Tuple{Int64,Int64}, ::Int64) at reshapedarray.jl:41
reshape at reshapedarray.jl:45 [inlined]
reshape at reshapedarray.jl:116 [inlined]
apply_X!(::Array{Float64,2}, ::Array{Float64,2}, ::Array{Float64,1}, ::Int64) at utils.jl:66
(::MLJLinearModels.var"#102#103"{GeneralizedLinearRegression{MultinomialLoss,ScaledPenalty{LPPenalty{2}}},Array{Float64,2},Array{Int64,1},Int64,Int64,Int64,Float64})(::Float64, ::Array{Float64,1}, ::Array{Float64,1}) at d_logistic.jl:149
(::NLSolversBase.var"#61#62"{NLSolversBase.InplaceObjective{Nothing,MLJLinearModels.var"#102#103"{GeneralizedLinearRegression{MultinomialLoss,ScaledPenalty{LPPenalty{2}}},Array{Float64,2},Array{Int64,1},Int64,Int64,Int64,Float64},Nothing,Nothing,Nothing},Float64})(::Array{Float64,1}, ::Array{Float64,1}) at incomplete.jl:45
value_gradient!!(::NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}) at interface.jl:82
initial_state(::Optim.LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#19#21"}, ::Optim.Options{Float64,Nothing}, ::NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}) at l_bfgs.jl:158
optimize(::NLSolversBase.OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}, ::Optim.LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#19#21"}, ::Optim.Options{Float64,Nothing}) at optimize.jl:33
#optimize#93 at interface.jl:116 [inlined]
optimize(::NLSolversBase.InplaceObjective{Nothing,MLJLinearModels.var"#102#103"{GeneralizedLinearRegression{MultinomialLoss,ScaledPenalty{LPPenalty{2}}},Array{Float64,2},Array{Int64,1},Int64,Int64,Int64,Float64},Nothing,Nothing,Nothing}, ::Array{Float64,1}, ::Optim.LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#19#21"}, ::Optim.Options{Float64,Nothing}) at interface.jl:115
optimize(::NLSolversBase.InplaceObjective{Nothing,MLJLinearModels.var"#102#103"{GeneralizedLinearRegression{MultinomialLoss,ScaledPenalty{LPPenalty{2}}},Array{Float64,2},Array{Int64,1},Int64,Int64,Int64,Float64},Nothing,Nothing,Nothing}, ::Array{Float64,1}, ::Optim.LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#19#21"}) at interface.jl:115
_fit(::GeneralizedLinearRegression{MultinomialLoss,ScaledPenalty{LPPenalty{2}}}, ::LBFGS, ::Array{Float64,2}, ::Array{Int64,1}) at newton.jl:114
#fit#144(::LBFGS, ::typeof(fit), ::GeneralizedLinearRegression{MultinomialLoss,ScaledPenalty{LPPenalty{2}}}, ::Array{Float64,2}, ::Array{Int64,1}) at default.jl:48
fit(::GeneralizedLinearRegression{MultinomialLoss,ScaledPenalty{LPPenalty{2}}}, ::Array{Float64,2}, ::Array{Int64,1}) at default.jl:38
top-level scope at test_LR.jl:159
tlienart commented 4 years ago

Hello, and sorry for the very late reply; I hadn't seen this issue.

Yes, it's by design that labels must be ±1 for binary and 1...c for multiclass (it's in the README, though I know there should be proper docs). This makes the computation of the softmax more direct. MLJLinearModels doesn't bother with encoding because users are expected to use MLJ or MLJBase for data preprocessing before calling this package.
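To make the convention concrete, a minimal sketch on made-up toy data (the sizes and λ values here are arbitrary, not from this issue) could look like:

using MLJLinearModels
X = randn(100, 3)
# binary case: labels must be -1 / +1
y_binary = rand([-1, 1], 100)
θ_bin = fit(LogisticRegression(0.5), X, y_binary)
# multiclass case: labels must be 1, 2, ..., c
y_multi = rand(1:3, 100)
θ_multi = fit(MultinomialRegression(0.5), X, y_multi)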

If you use MLJ as a way to call MLJLinearModels, you never have to think about these things: the numbers in your target vector (0, 1, ...) are just treated as labels, and the proper encoding is done behind the scenes.
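Roughly, something along these lines (a sketch only; the model name MultinomialClassifier and the lambda keyword here are assumptions about the MLJ interface to this package, so check the MLJ model registry for the exact names):

using MLJ
MNC = @load MultinomialClassifier pkg=MLJLinearModels
X = MLJ.table(randn(100, 3))
y = coerce(rand(0:9, 100), Multiclass)   # 0-based integer labels are fine here
mach = machine(MNC(lambda=0.1), X, y)
fit!(mach)
ŷ = predict_mode(mach, X)                # predictions come back as the original labels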