bbalasub1 / glmnet_python


Issues when intr=False #54

Open lazycal opened 3 years ago

lazycal commented 3 years ago

Hello,

I find that when not fitting an intercept (i.e. passing intr=False to glmnet or cvglmnet), the reported %dev becomes much larger and often exceeds 1, causing only very few lambdas to be searched. Is this a bug? The same issue occurs even after adding a column of 1s to X to simulate the intercept. Code to reproduce the issue:

import glmnet_python
from glmnet import glmnet
from glmnetPlot import glmnetPlot
from glmnetPrint import glmnetPrint
from glmnetCoef import glmnetCoef
from glmnetPredict import glmnetPredict
from cvglmnet import cvglmnet
from cvglmnetCoef import cvglmnetCoef
from cvglmnetPlot import cvglmnetPlot
from cvglmnetPredict import cvglmnetPredict
import numpy as np

# simulate data: y depends only on the first 3 of 10 features, plus uniform noise
np.random.seed(0)
x = np.random.rand(1000, 10)
w = np.zeros((10, 1))
w[:3] = 1
y = np.matmul(x, w) + np.random.rand(1000, 1)

# baseline: default fit (with intercept)
fit = glmnet(x = x.copy(), y = y.copy())
glmnetPrint(fit)
glmnetPlot(fit, xvar = 'lambda', label = True);

# removing the intercept
fit = glmnet(x = x.copy(), y = y.copy(), intr=False)
glmnetPrint(fit)
glmnetPlot(fit, xvar = 'lambda', label = True);

# append a column of 1s to X to simulate the intercept, still with intr=False
n = x.shape[0]
x1 = np.concatenate([x, np.ones(shape=(n, 1))], axis=1)
print('x1.shape=',x1.shape)
fit = glmnet(x = x1.copy(), y = y.copy(), intr=False)
glmnetPrint(fit)
glmnetPlot(fit, xvar = 'lambda', label = True);
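
For reference, the truncated path can also be checked directly on the returned fit object. This is just a quick inspection sketch; the 'lambdau', 'dev' and 'nulldev' keys are my assumption about the glmnet_python fit dictionary and may need adjusting:

# inspect the raw path values behind glmnetPrint (key names assumed)
fit_noint = glmnet(x = x.copy(), y = y.copy(), intr=False)
print('lambdas actually computed:', len(fit_noint['lambdau']))
print('deviance ratios along the path:', fit_noint['dev'])
print('null deviance used internally:', fit_noint['nulldev'])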

Results

Default fit (with intercept):

     df      %dev    lambdau

0    0.000000    0.000000    0.293214
1    3.000000    0.069980    0.267166
2    3.000000    0.185491    0.243432
3    3.000000    0.281389    0.221806
4    3.000000    0.361006    0.202101
5    3.000000    0.427106    0.184147
6    3.000000    0.481982    0.167788
7    3.000000    0.527542    0.152882
8    3.000000    0.565367    0.139301
9    3.000000    0.596769    0.126926
10   3.000000    0.622840    0.115650
11   3.000000    0.644485    0.105376
12   3.000000    0.662454    0.096015
13   3.000000    0.677373    0.087485
14   3.000000    0.689759    0.079713
15   3.000000    0.700042    0.072631
16   3.000000    0.708579    0.066179
17   3.000000    0.715667    0.060300
18   3.000000    0.721551    0.054943
19   3.000000    0.726436    0.050062
20   3.000000    0.730492    0.045615
21   3.000000    0.733859    0.041562
22   3.000000    0.736654    0.037870
23   3.000000    0.738975    0.034506
24   3.000000    0.740902    0.031440
25   3.000000    0.742502    0.028647
26   3.000000    0.743830    0.026102
27   3.000000    0.744933    0.023784
28   3.000000    0.745848    0.021671
29   3.000000    0.746608    0.019746
30   3.000000    0.747239    0.017991
31   3.000000    0.747763    0.016393
32   4.000000    0.748222    0.014937
33   4.000000    0.748696    0.013610
34   4.000000    0.749090    0.012401
35   4.000000    0.749416    0.011299
36   4.000000    0.749688    0.010295
37   4.000000    0.749913    0.009381
38   4.000000    0.750100    0.008547
39   5.000000    0.750264    0.007788
40   6.000000    0.750441    0.007096
41   7.000000    0.750609    0.006466
42   8.000000    0.750768    0.005891
43   8.000000    0.750909    0.005368
44   8.000000    0.751025    0.004891
45   8.000000    0.751122    0.004457
46   8.000000    0.751203    0.004061
47   8.000000    0.751270    0.003700
48   8.000000    0.751325    0.003371
49   8.000000    0.751371    0.003072
50   8.000000    0.751409    0.002799
51   9.000000    0.751443    0.002550
52   9.000000    0.751473    0.002324
53   9.000000    0.751497    0.002117
54   9.000000    0.751517    0.001929
55   10.000000   0.751536    0.001758
56   10.000000   0.751551    0.001602
57   10.000000   0.751564    0.001459
58   10.000000   0.751575    0.001330
59   10.000000   0.751584    0.001212
60   10.000000   0.751592    0.001104

intr=False:

     df      %dev    lambdau

0    0.000000    0.000000    3.621041
1    2.000000    1.951328    3.299358
2    2.000000    3.592739    3.006252
3    3.000000    5.066762    2.739185
4    3.000000    6.293260    2.495843
x1.shape= (1000, 11)

intr=False with a column of 1s appended (x1):

     df      %dev    lambdau

0    0.000000    0.000000    3.621041
1    2.000000    1.951328    3.299358
2    2.000000    3.592739    3.006252
3    3.000000    5.066762    2.739185
4    3.000000    6.293260    2.495843
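
For comparison, here is a plain-numpy sanity check (my own sketch, not part of glmnet_python) fitting an unpenalized no-intercept least-squares model on the same data. Any ratio of the form 1 - RSS/nulldev with RSS >= 0 and nulldev > 0 is at most 1 by construction, so the %dev values above 1 in the intr=False output look inconsistent with a fraction of deviance explained:

import numpy as np

# same simulated data as above
np.random.seed(0)
x = np.random.rand(1000, 10)
w = np.zeros((10, 1)); w[:3] = 1
y = np.matmul(x, w) + np.random.rand(1000, 1)

# unpenalized least squares without an intercept column
beta, *_ = np.linalg.lstsq(x, y, rcond=None)
rss = np.sum((y - x @ beta) ** 2)

null_zero = np.sum(y ** 2)               # null model that always predicts 0
null_mean = np.sum((y - y.mean()) ** 2)  # null model that predicts mean(y)

print('dev ratio vs zero null:', 1 - rss / null_zero)  # lies in [0, 1]
print('dev ratio vs mean null:', 1 - rss / null_mean)  # <= 1 by construction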