haifengl / smile

Statistical Machine Intelligence & Learning Engine
https://haifengl.github.io

Numerical differences with GLMNET in R for Lasso #711

Closed stanfordstrickland closed 2 years ago

stanfordstrickland commented 2 years ago

**Describe the bug**
Numerical differences with GLMNET in R when running various lasso regression analyses (`LASSO.fit` with default params except lambda).

**Expected behavior**
When setting lambda to zero, I expect the fit to match standard OLS. Also, driving lambda large should shrink all coefficients to zero, but that doesn't appear to happen.
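Both limiting behaviors can be sanity-checked with a small self-contained coordinate-descent lasso in numpy (a hedged sketch, independent of Smile's or glmnet's implementation; random data, no intercept for simplicity): at lambda = 0 the solution should coincide with OLS, and a sufficiently large lambda should zero every coefficient.

```python
import numpy as np

def lasso_cd(X, y, lam, iters=1000):
    """Coordinate descent for the unscaled objective ||y - Xw||^2 + lam * ||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(iters):
        for j in range(p):
            r = y - X @ w + X[:, j] * w[j]        # partial residual excluding feature j
            rho = X[:, j] @ r
            z = X[:, j] @ X[:, j]
            # soft-threshold update for coordinate j
            w[j] = np.sign(rho) * max(abs(rho) - lam / 2, 0.0) / z
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([3.0, -2.0, 0.5]) + rng.normal(size=50)

w_ols = np.linalg.lstsq(X, y, rcond=None)[0]
assert np.allclose(lasso_cd(X, y, lam=0.0), w_ols, atol=1e-6)  # lambda = 0  -> matches OLS
assert np.allclose(lasso_cd(X, y, lam=1e6), 0.0)               # huge lambda -> all zeros
```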

**Actual behavior**

Setting lambda=0. In R:

```r
x <- data.matrix(mtcars[c("mpg", "wt", "drat")])
y <- mtcars$hp

fit_ols <- lm(y ~ x)
# Coefficients:
# (Intercept)    xmpg     xwt   xdrat
#     270.869  -9.860   1.087  19.576

fit <- glmnet(x, y, family = "gaussian", alpha = 1, lambda = 0.0)
coef(fit)
#                     s0
# (Intercept) 271.027661
# mpg          -9.865943
# wt            1.060851
# drat         19.590868
```

In Smile:

```
(Intercept)   xmpg   xwt   xdrat
   146.6875      0     0       0
```

Setting lambda=1. In R:

```r
fit <- glmnet(x, y, family = "gaussian", alpha = 1, lambda = 1.0)
# (Intercept) 289.334189
# mpg          -9.455969
# wt            .
# drat         13.159685
```

In Smile:

```
(Intercept)        xmpg              xwt              xdrat
271.0428566216602  -9.85148988995701 1.08611366522839 19.483307535821027
```

Setting lambda=5. In R:

```r
fit <- glmnet(x, y, family = "gaussian", alpha = 1, lambda = 5.0)
# (Intercept) 307.148291
# mpg          -7.986849
# wt            .
# drat          .
```

In Smile:

```
(Intercept)        xmpg               xwt                xdrat
272.52477804190545 -9.832302034900476 0.9697086497929934 19.068213015753336
```

Also, I've noticed that sklearn and R scale the objective function:

(1 / (2 n_samples)) ||y - Xw||^2_2 + alpha * ||w||_1

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html

haifengl commented 2 years ago

First of all, do you use the same alpha = 1 for Smile? Since sklearn and R scale the objective function, you should use 2 * n * alpha for Smile (i.e. 64). Besides, GLMNET implements Elastic Net, not plain LASSO. Please try the ElasticNet class in Smile. Thanks.
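The scaling relationship above can be checked numerically: multiplying the sklearn/glmnet-style per-sample objective by 2n gives an unscaled objective whose penalty coefficient is 2 * n * alpha, so for mtcars (n = 32) and alpha = 1 the equivalent unscaled lambda is 64. A quick numpy sketch with random data, just verifying the algebra:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 32, 3                       # n = 32 rows, like mtcars
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
w = rng.normal(size=p)             # an arbitrary coefficient vector
alpha = 1.0

# sklearn/glmnet-style scaled objective: (1/(2n))||y - Xw||^2 + alpha||w||_1
scaled = (1.0 / (2 * n)) * np.sum((y - X @ w) ** 2) + alpha * np.sum(np.abs(w))

# unscaled objective with lambda = 2 * n * alpha (= 64 here)
unscaled = np.sum((y - X @ w) ** 2) + 2 * n * alpha * np.sum(np.abs(w))

# identical up to the constant factor 2n, so both have the same minimizer
assert np.isclose(unscaled, 2 * n * scaled)
assert 2 * n * alpha == 64
```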