Open mamitsu2 opened 5 years ago
Thanks for the comments.
Could you share the dataset you used and your setup, so that we can reproduce the bug?
Thanks!
Sorry for the incomplete report. I used sklearn.datasets.load_boston to test this module. With the following code, the bug does not occur.
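import numpy as np
import pandas as pd
from sklearn.datasets import load_boston
from pyHSICLasso import HSICLasso
dataset = load_boston()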
# set dataframe
X1_ = pd.DataFrame(dataset.data, columns=dataset.feature_names)
y1_ = pd.DataFrame(dataset.target, columns=['y'])
X1_ = X1_.iloc[:,:]
X1 = np.array(X1_)
y1 = np.array(y1_)
X1_col = X1_.columns
hsic_lasso = HSICLasso()
hsic_lasso.input(X1,y1.flatten(),featname=list(X1_col))
hsic_lasso.regression(num_feat=X1.shape[1], discrete_x=False, n_jobs=2)
hsic_lasso.dump()
hsic_lasso.get_index_score()
However, if I reduce the number of explanatory variables as follows,
X1_ = X1_.iloc[:,:5]
"ValueError: attempt to get argmax of an empty sequence" is raised.
The error occurs here:
~/.pyenv/versions/anaconda3-5.2.0/lib/python3.6/site-packages/pyHSICLasso/nlars.py in nlars(X, X_ty, num_feat, max_neighbors)
97 XtXbeta = np.dot(X.transpose(), np.dot(X, beta))
98 c = X_ty - XtXbeta
---> 99 j = np.argmax(c[I])
100 C = max(c[I])
101
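For reference, the error message itself can be reproduced with NumPy alone. This is only a minimal sketch, not the library's code: `c` and `I` below are made-up stand-ins for the variables in the traceback, under the assumption that `I` ends up empty.

```python
import numpy as np

# Hypothetical stand-ins for the variables named in the traceback.
c = np.array([0.3, 0.1, 0.7])
I = np.array([], dtype=int)  # assumption: no candidate indices remain

try:
    j = np.argmax(c[I])  # c[I] is an empty array here
except ValueError as err:
    message = str(err)
    print(message)
```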
Thanks for the detailed information. We will investigate this case.
I've got the same problem with another dataset with few explanatory variables.
I then reproduced the error with sklearn.datasets.load_boston and 5 features. It seems that the I array becomes empty ([]) when lasso_cond=0, and this case is neither checked in the while loop nor handled anywhere else.
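To illustrate the kind of check I mean, here is a rough sketch of a guard around the argmax step. The helper name `pick_next_feature` is hypothetical and is not part of pyHSICLasso. It only shows the idea of testing whether `I` is empty before calling `np.argmax`.

```python
import numpy as np


def pick_next_feature(c, I):
    """Hypothetical helper: return (feature index, value) for the largest
    entry of c among the candidates in I, or None if I is empty."""
    I = np.asarray(I, dtype=int)
    if I.size == 0:
        # No candidates left: signal the caller instead of raising ValueError.
        return None
    pos = int(np.argmax(c[I]))  # position within the candidate list
    return int(I[pos]), float(c[I][pos])
```

A caller could leave the while loop when `None` is returned. Whether stopping early at that point still yields sensible feature scores is something the maintainers would need to confirm.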
Any hints on how to fix this issue? I think the library is very interesting, and HSIC-based optimization may also be useful for datasets with few columns.
Thank you!
Thanks for your input. We have been looking at alternative Lasso solvers. Unfortunately, we haven't found one that checks all of our boxes... We'll be on the lookout for new solvers that would address this issue.
Thanks for reporting the bug. Has anyone solved it?
The bug occurs when there are few explanatory variables. Please fix it!