scalanlp / breeze

Breeze is a numerical processing library for Scala.
www.scalanlp.org
Apache License 2.0

Lasso regression results do not match scikit-learn or R for larger lambda #812

Open memoryz opened 3 years ago

memoryz commented 3 years ago

The lasso regression in breeze does not produce the same coefficients as scikit-learn or R (glmnet).

Testing code:

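```scala
// Imports assumed from the BDM/BDV aliases and the lasso call used below.
import breeze.linalg.{DenseMatrix => BDM, DenseVector => BDV}
import breeze.stats.regression.lasso

val a = BDM((1.0, 1.0), (2.0, -2.0), (3.0, 3.0), (4.0, 5.0))
val b = BDV(1.01, 1.98, 3.03, 4.05)
val lambda = 0.5d
val result = lasso(a, b, lambda)
println(result.coefficients)
```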


- scikit-learn
```python
import numpy as np
from sklearn import linear_model
X = np.array([[1, 1], [2, -2], [3, 3], [4, 5]])
y = np.array([1.01, 1.98, 3.03, 4.05])
clf = linear_model.Lasso(alpha=0.5, fit_intercept=False)
clf.fit(X, y)
print(clf.coef_)
```

Results (two coefficients per cell):

| lambda | breeze | scikit-learn | R |
|---|---|---|---|
| 0.01 | 0.9968421052631582, 0.0021 | 0.99905151, 0.00960668 | 1.002109388, 0.005941413 |
| 0.05 | 0.9586666666666666, 0.0 | 0.99482704, 0.00832043 | 1.001213, 0.0 |
| 0.1 | 0.9086666666666666, 0.0 | 0.98959963, 0.00667717 | 0.9937595, 0.0 |
| 0.5 | 0.5086666666666666, 0.0 | 0.942, 0.0 | 0.9341311, 0.0 |
| 1.0 | 0.0086666666666666, 0.0 | 0.87533333, 0.0 | 0.8595955, 0.0 |

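scikit-learn and R agree to within about 0.02 on every coefficient, but breeze's results diverge as lambda grows: from lambda = 0.05 to lambda = 1.0 its first coefficient drops one-for-one with lambda (0.959 to 0.009), while scikit-learn's only drops from 0.995 to 0.875.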
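As a cross-check on the scikit-learn numbers, here is a minimal sketch of cyclic coordinate descent on the objective that scikit-learn documents for `Lasso`, `(1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1`, with no intercept. The names `soft_threshold` and `lasso_cd` are made up for illustration and are not part of scikit-learn or breeze; this is not a claim about how either library is implemented internally.

```python
# Sketch: coordinate descent for (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1.
# No intercept. soft_threshold and lasso_cd are hypothetical helper names.


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, alpha, sweeps=100):
    """Cyclic coordinate descent; X is a list of rows, y a list of targets."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of column j with the residual that leaves out w[j].
            rho = 0.0
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            z = sum(X[i][j] ** 2 for i in range(n))
            # Minimizing over w[j] alone gives a soft-threshold at n * alpha.
            w[j] = soft_threshold(rho, n * alpha) / z
    return w


X = [[1.0, 1.0], [2.0, -2.0], [3.0, 3.0], [4.0, 5.0]]
y = [1.01, 1.98, 3.03, 4.05]

for alpha in (0.5, 1.0):
    print(alpha, lasso_cd(X, y, alpha))
```

For alpha = 0.5 and alpha = 1.0 the second coefficient is thresholded to zero, so the first reduces to `(30.26 - 4 * alpha) / 30`, which gives 0.942 and about 0.87533, in line with the scikit-learn column.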

tsweetser-hbo commented 1 year ago

In your R example, did you mean intercept=FALSE, consistent with the Python example?

tsweetser-hbo commented 1 year ago

I looked at this, and I am not sure what objective function Breeze's lasso is using. I am not familiar with that implementation of lasso shrinkage, but maybe it is somehow equivalent to the usual objective function (the form scikit-learn uses)?
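For reference, the usual form mentioned here, as documented for scikit-learn's `Lasso` without an intercept, is written out below, with n the number of rows and α the penalty weight. The second formula is the single-coordinate solution for this data set when the second coefficient is zero. The third is only an arithmetic observation about the Breeze values reported above for lambda between 0.05 and 1.0; it is an inference from those numbers, not a statement about Breeze's source code.

```latex
\min_{w}\; \frac{1}{2n}\,\lVert y - Xw \rVert_2^2 + \alpha\,\lVert w \rVert_1

% With w_2 = 0, n = 4, x_1^\top y = 30.26 and x_1^\top x_1 = 30:
w_1 = \frac{x_1^\top y - n\alpha}{x_1^\top x_1} = \frac{30.26 - 4\alpha}{30}
\qquad (\alpha = 0.5 \;\Rightarrow\; w_1 = 0.942)

% Observation: the reported Breeze values for 0.05 \le \lambda \le 1.0 satisfy
w_1^{\mathrm{breeze}} = \frac{x_1^\top y}{x_1^\top x_1} - \lambda = \frac{30.26}{30} - \lambda
```

If that pattern holds in general, the two results would differ by a rescaling of the penalty (a threshold of λ·x₁ᵀx₁ instead of nα), which would be worth checking against the Breeze implementation.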