Lasso regression results do not match scikit-learn or R for larger lambda

memoryz commented 3 years ago

The lasso regression in breeze does not match the results from scikit-learn or R (glmnet).

Testing code:

- breeze
```scala
import breeze.linalg.{norm, DenseMatrix => BDM, DenseVector => BDV}
import breeze.stats.regression.lasso

val a = BDM((1.0, 1.0), (2.0, -2.0), (3.0, 3.0), (4.0, 5.0))
val b = BDV(1.01, 1.98, 3.03, 4.05)
val lambda = 0.5d
val result = lasso(a, b, lambda)
println(result.coefficients)
```


- scikit-learn
```python
import numpy as np
from sklearn import linear_model
X = np.array([[1, 1], [2, -2], [3, 3], [4, 5]])
y = np.array([1.01, 1.98, 3.03, 4.05])
clf = linear_model.Lasso(alpha=0.5, fit_intercept = False)
clf.fit(X, y)
print(clf.coef_)

```
- R
```r
library(glmnet)
X = matrix(c(1, 1, 2, -2, 3, 3, 4, 5), nrow = 4, ncol=2, byrow = TRUE)
y = c(1.01, 1.98, 3.03, 4.05)
coef(glmnet(X, y, alpha=1, lambda=0.5, intercept=TRUE))
```

Results:

| lambda | breeze | scikit-learn | R (glmnet) |
|---|---|---|---|
| 0.01 | 0.9968421052631582, 0.0021 | 0.99905151, 0.00960668 | 1.002109388, 0.005941413 |
| 0.05 | 0.9586666666666666, 0.0 | 0.99482704, 0.00832043 | 1.001213, 0.0 |
| 0.1 | 0.9086666666666666, 0.0 | 0.98959963, 0.00667717 | 0.9937595, 0.0 |
| 0.5 | 0.5086666666666666, 0.0 | 0.942, 0.0 | 0.9341311, 0.0 |
| 1.0 | 0.0086666666666666, 0.0 | 0.87533333, 0.0 | 0.8595955, 0.0 |

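scikit-learn more or less agrees with R, but breeze's results diverge sharply as lambda grows: at lambda = 1.0 breeze returns about 0.0087 for the first coefficient, versus about 0.875 from scikit-learn and 0.860 from R.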
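
One pattern in the numbers above may help narrow this down. For lambda = 0.5 and 1.0 both breeze and scikit-learn report a zero second coefficient, so the fit reduces to a one-feature problem with a closed-form solution. The sketch below is plain Python, no libraries. It evaluates that closed form for scikit-learn's documented objective, and compares it with a second rule, `ols_then_threshold`, which soft-thresholds the least-squares coefficient by lambda directly. That second rule is only a guess that happens to reproduce the reported breeze numbers; it is not taken from breeze's source, and the function names are made up for illustration.

```python
# First column of X and the response y, copied from the test code above.
x1 = [1.0, 2.0, 3.0, 4.0]
y = [1.01, 1.98, 3.03, 4.05]
n = len(y)

xy = sum(a * b for a, b in zip(x1, y))  # inner product x1 . y
xx = sum(a * a for a in x1)             # inner product x1 . x1


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sklearn_style(lam):
    # Minimiser of (1 / (2n)) * sum((y - x1 * w)^2) + lam * |w|
    # for a single feature and no intercept.
    return soft_threshold(xy / n, lam) / (xx / n)


def ols_then_threshold(lam):
    # ASSUMPTION: a guessed rule that matches the reported breeze output,
    # not something read from breeze's implementation.
    return soft_threshold(xy / xx, lam)


for lam in (0.5, 1.0):
    print(lam, round(sklearn_style(lam), 8), round(ols_then_threshold(lam), 8))
```

If the guess is right, breeze's lambda is on a different scale from scikit-learn's `alpha`: in this one-feature case the penalty is effectively multiplied by `xx / n`, which is 7.5 for this data. That would explain why the gap widens as lambda grows.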

tsweetser-hbo commented 1 year ago

In your R example, did you mean intercept=FALSE, consistent with the Python example?

tsweetser-hbo commented 1 year ago

I looked at this and am not sure what objective function Breeze's lasso is using. I am not familiar with that implementation of lasso shrinkage, but maybe it's somehow equivalent to the usual form of the objective function (as in scikit-learn)?
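
For reference, here are the objectives that scikit-learn and glmnet document for the lasso, in LaTeX. Here $n$ is the number of samples, $\alpha$ is scikit-learn's `alpha`, and $\lambda$ is glmnet's `lambda` with `alpha=1`. Nothing here describes breeze; it is only the baseline that breeze's output would need to be compared against.

```latex
% scikit-learn, Lasso(alpha=..., fit_intercept=False)
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1

% glmnet, family = "gaussian", alpha = 1, with intercept \beta_0
\min_{\beta_0, \beta} \; \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \beta_0 - x_i^{\top} \beta \right)^2 + \lambda \lVert \beta \rVert_1
```

The two coincide when the intercept is dropped and the penalties are equal. The small remaining differences between the scikit-learn and R columns may come from the intercept in the R call and from glmnet standardizing predictors by default.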

scalanlp/breeze #812