scikit-learn-contrib / boruta_py

Python implementations of the Boruta all-relevant feature selection method.
BSD 3-Clause "New" or "Revised" License

iteration over a 0-d array #43

Closed Leninstark closed 5 years ago

Leninstark commented 5 years ago

Hi Team - I have hit this "iteration over a 0-d array" error for a specific dataset. I read all the Q&A and understood that it is fixed (if I am right), but the problem seems to persist for one dataset (wine). There are no NaN values in any row or column, and no array made up entirely of NaN values, yet I am still facing this issue.

It would be of great help if you could guide me on this, unless I have coded it wrongly. Thanks. Here are the dataset and code: wine.csv.zip FEATURE_SELECTION_BORUTA.py.zip

ihopethiswillfi commented 5 years ago

Same problem when using:


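```python
import pandas as pd
from boruta import BorutaPy
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

# Load the California housing data as plain NumPy arrays
data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names).values
y = pd.Series(data.target, name='label').values

# Fitting BorutaPy here raises "iteration over a 0-d array"
rf = RandomForestRegressor(n_jobs=-1, max_depth=5)
feat_selector = BorutaPy(rf, n_estimators=100, verbose=3, random_state=2)
feat_selector.fit(X, y)
```

salml commented 5 years ago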

I'm having a similar issue on this Kaggle dataset, which is really easy to predict (0.99 F1 score with a random forest, default hyperparameters). Judging by the output, it may be because all of the features are relevant?

danielhomola commented 5 years ago

This should be handled now with the latest PR, thanks to @guitarmind.

jon-mic commented 5 years ago

I am still facing this issue with the iris dataset. I installed the most recent version of BorutaPy directly from GitHub and get the error if no features are rejected.

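```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

# Build a DataFrame with the four iris features plus the target column
iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])
X = iris.drop(columns='target').values
y = iris['target'].values
```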

Adding iris['test'] = 1, an arbitrary column that leads to a rejection, does not raise the error.
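For readers unfamiliar with the message itself: in NumPy, "iteration over a 0-d array" is raised when code loops over an array that has been collapsed to a scalar-like shape. The sketch below is a general NumPy illustration only; whether this is the exact code path inside BorutaPy is an assumption that this thread does not confirm, and `safe_indices` is a hypothetical helper, not part of any library.

```python
import numpy as np

def safe_indices(mask):
    """Return the positions where mask is True, always as a 1-d array."""
    # squeeze() collapses a single hit of shape (1, 1) into a 0-d array
    idx = np.argwhere(mask).squeeze()
    # atleast_1d() restores a 1-d shape so the result is always iterable
    return np.atleast_1d(idx)

# A mask with exactly one True value reproduces the 0-d case
single = np.argwhere(np.array([False, True, False])).squeeze()

message = None
try:
    for _ in single:
        pass
except TypeError as exc:
    message = str(exc)

print(single.ndim, message)
print(safe_indices(np.array([False, True, False])))
```

Guarding with `np.atleast_1d` keeps the result iterable whether the mask selects zero, one, or many entries.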

danielhomola commented 5 years ago

I forgot to update the PyPI version. Please install the latest version from GitHub directly.
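To check which build is actually installed after reinstalling, one option is to query the package metadata from Python. This is a minimal sketch: `installed_version` is a hypothetical helper, and the distribution name "Boruta" is an assumption about how the package is registered in your environment.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

# "Boruta" is assumed to be the distribution name; adjust if yours differs.
print(installed_version("Boruta"))
```

Note that a GitHub install can report the same version string as the PyPI release if the version number was not bumped, so a matching number does not by itself prove the fix is absent.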