scikit-learn / scikit-learn

scikit-learn: machine learning in Python
https://scikit-learn.org
BSD 3-Clause "New" or "Revised" License

GraphLassoCV breaks with some keyword values #2159

Closed pgervais closed 11 years ago

pgervais commented 11 years ago

The last two lines of this sample break with scikit-learn dev:

import numpy as np
from sklearn.covariance import GraphLassoCV
from sklearn.cross_validation import KFold
from sklearn.datasets import make_sparse_spd_matrix

dim = 5
n_samples = 6
random_state = np.random.RandomState(42)
prec = make_sparse_spd_matrix(dim, alpha=.96,
                              random_state=random_state)
cov = np.linalg.inv(prec)
X = random_state.multivariate_normal(np.zeros(dim), cov, size=n_samples)
GraphLassoCV(verbose=1, alphas=[0.8, 0.5, 0.1], n_jobs=1).fit(X)
GraphLassoCV(cv=KFold(n=X.shape[0], n_folds=5), n_jobs=1, verbose=1).fit(X)

Namely, passing a list of values for alphas, as advertised in the documentation, does not work, and using a cross-validation object with more than three folds breaks as well.

larsmans commented 11 years ago

Confirmed. @GaelVaroquaux knows this part of the code well.

larsmans commented 11 years ago

It breaks when np.logspace is called with an array n_alphas as the third parameter, because logspace expects a scalar.
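
For illustration, here is a minimal sketch of one way to accept either form of alphas before building the grid. The helper name `resolve_alphas` and the `alpha_max` / `alpha_min` defaults are hypothetical, and this is not necessarily how the fix was implemented in scikit-learn:

```python
import numpy as np


def resolve_alphas(alphas, alpha_max=1.0, alpha_min=0.01):
    """Hypothetical helper: return a descending grid of alpha values.

    `alphas` is either an integer (the number of grid points) or a
    sequence of explicit values. The bounds are illustrative defaults,
    not values taken from scikit-learn.
    """
    if np.ndim(alphas) == 0:
        # Scalar case: np.logspace needs an integer count as its third argument.
        return np.logspace(np.log10(alpha_max), np.log10(alpha_min), int(alphas))
    # Sequence case: use the values as given, sorted in decreasing order,
    # and never pass the sequence itself to np.logspace.
    return np.sort(np.asarray(alphas, dtype=float))[::-1]


print(resolve_alphas(4))
print(resolve_alphas([0.1, 0.8, 0.5]))
```

The point is only that the scalar and sequence cases need separate branches, since np.logspace cannot take the list as its number of points.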

pgervais commented 11 years ago

I should have said that I read the code and I know how to fix these problems. I'm planning to do that during the sprint next week in Paris.

larsmans commented 11 years ago

Ok, great! :)

pgervais commented 11 years ago

A small correction: using KFold with n_folds=5 does not break GraphLassoCV, but not all folds are explored during the computation (only 3 folds are used in any case).
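
As a point of comparison, here is a plain-NumPy sketch of what a 5-fold split over the 6 samples from the example should yield. The helper `kfold_indices` is hypothetical and is not scikit-learn's KFold implementation; it only shows that five distinct test folds are expected, not three:

```python
import numpy as np


def kfold_indices(n_samples, n_folds):
    """Hypothetical helper: yield (train, test) index arrays for
    contiguous k-fold splits, without shuffling."""
    indices = np.arange(n_samples)
    # np.array_split tolerates n_samples not being divisible by n_folds.
    for test in np.array_split(indices, n_folds):
        train = np.setdiff1d(indices, test)
        yield train, test


folds = list(kfold_indices(n_samples=6, n_folds=5))
for train, test in folds:
    print("train:", train, "test:", test)
print(len(folds))  # 5
```

A check along these lines, counting the folds actually visited during fitting, would catch the behaviour described above.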

amueller commented 11 years ago

Was that fixed in #2183?

pgervais commented 11 years ago

Yes. This issue can be closed.

amueller commented 11 years ago

thanks :)