bbalasub1 / glmnet_python

GNU General Public License v3.0

Fix one-column offset for binomial models #52

Open DexGroves opened 3 years ago

DexGroves commented 3 years ago

Thanks so much for bringing this package to Python! Thanks especially for including all the distributions; it's a huge help.

One-column offset support for binomial models seems broken:

import numpy as np
from glmnet_python import glmnet

N = 100
p = 5

X = np.random.normal(size=(N, p))
y = (np.random.uniform(size=N) > 0.5).astype(float)
w = np.random.uniform(size=(N, 1))

offset = np.random.normal(size=(N, 1))

fit = glmnet(x=X, y=y, weights=w, family="binomial", offset=offset)

Throws:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-3390e77801ef> in <module>
     12 offset = np.random.normal(size=(N, 1))
     13 
---> 14 fit = glmnet(x=X, y=y, weights=w, family="binomial", offset=offset)

~/git/glmnet_python/glmnet_python/glmnet.py in glmnet(x, y, family, **options)
    451     elif (family == 'binomial') or (family == 'multinomial'):
    452         # call lognet
--> 453         fit = lognet(x, is_sparse, irs, pcs, y, weights, offset, parm,
    454                      nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam,
    455                      thresh, isd, intr, maxit, kopt, family)

~/git/glmnet_python/glmnet_python/lognet.py in lognet(x, is_sparse, irs, pcs, y, weights, offset, parm, nobs, nvars, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh, isd, intr, maxit, kopt, family)
     72         if nc == 1:
     73             if do[1] == 1:
---> 74                 offset = scipy.column_stack((offset, -offset), 1)
     75             if do[1] > 2:
     76                 raise ValueError('offset should have 1 or 2 columns in binomial call to glmnet')

~/anaconda3/envs/football/lib/python3.8/site-packages/scipy/_lib/deprecation.py in call(*args, **kwargs)
     18             warnings.warn(msg, category=DeprecationWarning,
     19                           stacklevel=stacklevel)
---> 20             return fun(*args, **kwargs)
     21         call.__doc__ = msg
     22         return call

<__array_function__ internals> in column_stack(*args, **kwargs)

TypeError: _column_stack_dispatcher() takes 1 positional argument but 2 were given

column_stack doesn't seem to take a second argument (and hasn't since at least 2006?), so the `1` passed on line 74 of lognet.py raises the TypeError. The code works as expected if you remove it.
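For reference, here is a minimal sketch of the behaviour in isolation. It assumes the intent of line 74 is to turn a one-column offset into the two-column form (offset, -offset). It calls `np.column_stack` directly rather than the deprecated `scipy.column_stack` alias seen in the traceback. The names `raised` and `two_col` are only for illustration and are not part of glmnet_python.

```python
import numpy as np

N = 100
offset = np.random.normal(size=(N, 1))

# column_stack accepts a single sequence of arrays; passing an extra
# positional argument (as on line 74 of lognet.py) raises TypeError.
try:
    np.column_stack((offset, -offset), 1)
    raised = False
except TypeError:
    raised = True

# Possible replacement for line 74 (an assumption, not the maintainers' fix):
# drop the second argument so the one-column offset becomes two columns.
two_col = np.column_stack((offset, -offset))

print(raised)          # whether the two-argument call failed
print(two_col.shape)   # shape of the stacked offset
```

With the extra argument removed, the stacked offset has two columns, the second being the negation of the first, so it passes the one-or-two-column check in the `do[1] > 2` branch shown in the traceback.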