L1 and standardization

jbrea commented 2 years ago

`fit!(machine(MLJLinearModels.LogisticClassifier(penalty = :l1), X, y))` leads to far worse results than `glmnet(X, y, Binomial())` for some datasets. When this happens, MLJLinearModels warns `Warning: No appropriate stepsize found via backtracking; interrupting.` It turns out that `fit!(machine(@pipeline(Standardizer(), MLJLinearModels.LogisticClassifier(penalty = :l1)), X, y))` is sufficient to recover (or even beat) the glmnet results, and the warning disappears. In fact, glmnet standardizes the input by default.

Would it make sense to standardize the input by default in MLJLinearModels? Or maybe change the warning to something like `Warning: No appropriate stepsize found via backtracking; interrupting. The reason could be input data that is not standardized.`
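For readers on a recent MLJ release, the workaround above can be sketched with the `Pipeline`/`|>` syntax, which has replaced the `@pipeline` macro. This is only an illustration under assumptions: the synthetic data, the variable names (`x1`, `x2`, `labels`, `pipe`) and the feature scales are made up here and are not from the datasets mentioned in the issue; it also assumes MLJ and MLJLinearModels are installed.

```julia
using MLJ                 # assumes MLJ and MLJLinearModels are installed
import MLJLinearModels
using Random

# Made-up data: two features on very different scales (illustrative only).
rng = MersenneTwister(1)
n = 200
x1 = randn(rng, n)
x2 = 5_000 .+ 1_000 .* randn(rng, n)
labels = ifelse.(x1 .+ (x2 .- 5_000) ./ 1_000 .> 0, "a", "b")

X = (x1 = x1, x2 = x2)          # column table with Continuous features
y = coerce(labels, Multiclass)  # categorical target

# Standardize the features first, then fit the L1-penalized logistic model.
pipe = Standardizer() |> MLJLinearModels.LogisticClassifier(penalty = :l1)
mach = machine(pipe, X, y)
fit!(mach, verbosity = 0)

yhat = predict_mode(mach, X)
println("training accuracy: ", accuracy(yhat, y))
```

Keeping the standardization in a pipeline, rather than inside the model, matches the division of labour described later in the thread: the preprocessing stays on the MLJ side and the solver only sees the problem it is given.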

tlienart commented 2 years ago

> Or maybe change the warning to something like `Warning: No appropriate stepsize found via backtracking; interrupting. The reason could be input data that is not standardized.`

would you kindly add a PR for this?

I think this makes more sense since MLJLM is meant to be a bit of a blackbox only solving specific optimisation problems and not really doing any of the adjustments that MLJ would help the user to apply (standardisation, imputation or whatever).

Thanks!

tlienart commented 2 years ago

Ah no but the backtracking note is something that comes from Optim so it'd have to be caught here and then some additional information added. Not sure how to do this. Maybe Optim returns some flag if the backtracking fails?

jbrea commented 2 years ago

> Ah no but the backtracking note is something that comes from Optim

Doesn't it come from here?

tlienart commented 2 years ago

right, thanks!
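The message change proposed in this thread could look roughly like the sketch below. It is hypothetical and self-contained: the function `backtrack`, its keyword arguments and the sufficient-decrease test are assumptions made for illustration, not the actual MLJLinearModels or Optim code. Only the warning text is taken from the proposal in the first comment.

```julia
# Hypothetical backtracking helper (NOT the package's implementation).
# Tries a gradient step x - step * g and halves `step` until a
# sufficient-decrease test passes or `max_tries` is exhausted.
function backtrack(f, x, g; step = 1.0, shrink = 0.5, max_tries = 20)
    fx = f(x)
    gnorm2 = sum(abs2, g)
    for _ in 1:max_tries
        # Armijo-type test: the step must reduce f by at least step/2 * ||g||^2.
        if f(x .- step .* g) <= fx - 0.5 * step * gnorm2
            return step, true
        end
        step *= shrink
    end
    # Same warning as before, extended with the hint suggested in the issue.
    @warn "No appropriate stepsize found via backtracking; interrupting. " *
          "The reason could be input data that is not standardized."
    return step, false
end

# Well-scaled objective: the first step size is accepted.
f_good(x) = 0.5 * sum(abs2, x)
step_good, ok_good = backtrack(f_good, [1.0, 1.0], [1.0, 1.0])

# Badly scaled objective (curvature 1e8): 20 halvings are not enough,
# so the extended warning is emitted and `ok_bad` is false.
f_bad(x) = 0.5e8 * x[1]^2
step_bad, ok_bad = backtrack(f_bad, [1.0], [1e8])
```

Returning a success flag alongside the step size is one way to let the caller decide whether to add context to the warning, which is the kind of flag asked about earlier in the thread.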

JuliaAI / MLJLinearModels.jl

L1 and standardization #106