Closed (Leninstark closed 5 years ago)
Same problem when using:
```python
import pandas as pd
from boruta import BorutaPy
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names).values
y = pd.Series(data.target, name='label').values
rf = RandomForestRegressor(n_jobs=-1, max_depth=5)
feat_selector = BorutaPy(rf, n_estimators=100, verbose=3, random_state=2)
feat_selector.fit(X, y)
```
I'm having a similar issue on this Kaggle dataset, which is really easy to predict (0.99 F1 score with a random forest, default hyperparameters). Judging by the output, could it be because all of the features are relevant?
This should be handled now with the latest PR, thanks to @guitarmind.
I am still facing this issue with the iris dataset. I installed the most recent version of BorutaPy directly from GitHub and I get the error if no features are rejected.
```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])
X = iris.drop(columns='target').values
y = iris['target'].values
```
Adding an arbitrary column that gets rejected, e.g. iris['test'] = 1, does not raise the error.
I forgot to update the PyPI version. Please install the latest version from GitHub directly.
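For anyone unsure which build they actually have, here is a minimal sketch for checking the installed version from Python. It assumes the distribution is registered under the name "Boruta"; that name is an assumption, not something confirmed in this thread, and a GitHub install may report the same version number as the PyPI release if the version was not bumped.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "Boruta" is an assumed distribution name; adjust it if your install differs.
print(installed_version("Boruta"))
```

If this prints None, the package is not installed in the environment running the script, which is worth ruling out before debugging further.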
Hi team, I am hitting this "iteration over a 0-d array" error on a specific dataset. I read through the whole thread and understood that it is fixed (if I am right), but the problem seems to persist for the wine dataset. There are no NaN values in any row or column, and no arrays consisting entirely of NaN values, yet I still get the error.
It would be a great help if you could guide me on this, unless my code is wrong. Thanks. Here are the dataset and code: wine.csv.zip FEATURE_SELECTION_BORUTA.py.zip
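For context on the error message itself, the sketch below shows how NumPy produces "iteration over a 0-d array" in general: squeezing a one-element array yields a 0-d array, which cannot be iterated. Whether this is the exact code path inside BorutaPy is not confirmed in this thread; the index array here is a hypothetical stand-in, and `np.atleast_1d` is shown only as one common guard.

```python
import numpy as np


def try_iterate(arr):
    """Return the items of arr as a list, or the TypeError message if iteration fails."""
    try:
        return [int(v) for v in arr]
    except TypeError as exc:
        return str(exc)


# Hypothetical stand-in for an index array that holds a single feature index.
single = np.array([3])
squeezed = np.squeeze(single)      # shape (): a 0-d array
guarded = np.atleast_1d(squeezed)  # shape (1,): safe to iterate

result_squeezed = try_iterate(squeezed)
result_guarded = try_iterate(guarded)
print(result_squeezed)
print(result_guarded)
```

This would be consistent with the reports above that the error depends on how many features end up rejected, but treat that link as a guess rather than a diagnosis.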