civisanalytics / python-glmnet

A python port of the glmnet package for fitting generalized linear models via penalized maximum likelihood.

model.fit fails due to reshaping error in predict_proba() #25

Closed Visdoom closed 5 years ago

Visdoom commented 7 years ago

Hey there, I've been trying to fit an Elastic Net with your toolbox and ran into an error:

In logistic.py, the predict_proba() function contains the following code:

        z = self.decision_function(X, lamb)
        expit(z, z)
        # z = np.atleast_2d(z)

        # reshape z to (n_samples, n_classes, n_lambda)
        n_lambda = len(np.atleast_1d(lamb))
        z = z.reshape(z.shape[0], -1, n_lambda)

However, when the passed X is only one-dimensional and, let's say, n_lambda = 86, then z.shape is (86,), not (1, 86): the first dimension is the number of lambdas rather than the number of samples. This makes the reshape fail, since it tries to reshape an array of 86 elements into an 86 x K x 86 array.
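The reshape failure can be reproduced outside the library with plain NumPy. This is only a minimal sketch: the all-zeros array stands in for the output of decision_function, and n_lambda = 86 is taken from the example above; nothing here calls glmnet itself.

```python
import numpy as np

n_lambda = 86

# Stand-in for the decision_function output when X is one-dimensional:
# a flat array with one value per lambda, shape (86,).
z = np.zeros(n_lambda)

# z.shape[0] is 86 here, so this asks for shape (86, -1, 86),
# which cannot hold only 86 elements.
try:
    z.reshape(z.shape[0], -1, n_lambda)
    reshape_failed = False
except ValueError:
    reshape_failed = True

# With np.atleast_2d the array becomes (1, 86), so z.shape[0] is 1
# and the same reshape succeeds.
z2 = np.atleast_2d(z)
reshaped = z2.reshape(z2.shape[0], -1, n_lambda)

print(reshape_failed)   # True
print(reshaped.shape)   # (1, 1, 86)
```

After the np.atleast_2d call the result has the (n_samples, n_classes, n_lambda) layout that the comment in the original code describes, with a single sample and a single class column.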

As you can see, I added the z = np.atleast_2d(z) line, which takes care of the reshaping problem. However, I then get this kind of error:

/usr/local/lib/python3.4/dist-packages/glmnet/logistic.py in predict(self, X, lamb)
    478         indices = scores.argmax(axis=1)
    479 
--> 480         return self.classes_[indices]
    481 
    482     def score(self, X, y, lamb=None):

IndexError: index 85 is out of bounds for axis 1 with size 2

since the output is then apparently no longer in the expected shape. I believe this error could be fixed with a simple axis=0 in line 478, but I do not have the full overview, so I thought it better to report back to you.
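To illustrate how an index of 85 can come out of the argmax call, here is a small NumPy sketch. It rests on an assumption: that scores reaches line 478 with the lambdas on axis 1, i.e. shape (1, 86), which I have not verified against the library code. The arrays scores and classes are made-up stand-ins, not glmnet objects.

```python
import numpy as np

n_lambda = 86

# Hypothetical stand-in for self.classes_ in a binary problem.
classes = np.array([0, 1])

# Assumed shape (1, n_lambda): one sample, lambdas on axis 1.
# Increasing values put the maximum in the last column.
scores = np.arange(n_lambda, dtype=float).reshape(1, n_lambda)

# argmax over axis 1 now picks a lambda position, not a class index.
indices = scores.argmax(axis=1)
print(indices)  # [85]

# Using that position to index the two classes raises the IndexError.
try:
    classes[indices]
    index_failed = False
except IndexError:
    index_failed = True

print(index_failed)  # True
```

Under this assumption the argmax runs over the lambda axis, so any fix has to make sure it runs over the class axis instead.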

Best, Sophie