scikit-learn-contrib / boruta_py

Python implementations of the Boruta all-relevant feature selection method.
BSD 3-Clause "New" or "Revised" License

AttributeError: module 'numpy' has no attribute 'bool' when using BorutaPy with RandomForestClassifier #120

Open codewithawr opened 9 months ago

codewithawr commented 9 months ago

I encountered an error while using BorutaPy with a RandomForestClassifier model. The error message is "AttributeError: module 'numpy' has no attribute 'bool'". It means BorutaPy is still using numpy.bool, a deprecated alias for the built-in bool that was removed in NumPy 1.24. To avoid the error, the code should use bool by itself, or numpy.bool_ if the NumPy scalar type is specifically intended.

Steps to Reproduce:

Bash

pip install numpy==1.25.2 Boruta==0.3 scikit-learn==1.3.0 pandas==2.1.0

Code

import numpy as np
from boruta import BorutaPy
from sklearn.ensemble import RandomForestClassifier

# Create a sample dataset with boolean features and labels
num_samples = 100
num_features = 5

# Generate random binary features (0 or 1)
x_train_rfe = np.random.randint(2, size=(num_samples, num_features), dtype=bool)

# Generate random binary labels (0 or 1)
y_train = np.random.randint(2, size=num_samples)

# Create a BorutaPy instance with a RandomForestClassifier
bl_model_ob = RandomForestClassifier(n_jobs=-1, max_depth=5, random_state=1) 
boruta_selection = BorutaPy(estimator=bl_model_ob, n_estimators='auto', verbose=2, random_state=1)

# Attempt to fit the BorutaPy model with the boolean dataset
boruta_selection.fit(x_train_rfe, y_train)

Behavior:

Error

--> 372     depth = self.estimator.get_params()['max_depth']
    373     if depth == None:
    374         depth = 10

KeyError: 'max_depth'

Additional Information:

Python Version: 3.9.12
Operating System: Windows

codewithawr commented 9 months ago

Here is what I am doing to make it work:

# Reassigning np.int to np.int64
np.int = np.int64

# Reassigning np.float to np.float64
np.float = np.float64

# Reassigning np.bool to np.bool_
np.bool = np.bool_

It's a temporary solution, but it might help someone.
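A slightly more defensive variant of the same workaround, as a rough sketch: it only restores an alias when the installed NumPy lacks it, and it maps each name back to the Python built-in that the old alias pointed to. The helper name `restore_numpy_aliases` is made up for this example and is not part of NumPy or BorutaPy.

```python
import warnings

import numpy as np

# The removed aliases were plain synonyms for the Python built-ins.
_REMOVED_ALIASES = {"int": int, "float": float, "bool": bool}


def restore_numpy_aliases():
    """Hypothetical helper: re-add np.int / np.float / np.bool if missing.

    Returns the list of alias names that were actually restored.
    """
    restored = []
    for name, builtin_type in _REMOVED_ALIASES.items():
        with warnings.catch_warnings():
            # Older NumPy versions warn when a deprecated alias is accessed.
            warnings.simplefilter("ignore")
            missing = not hasattr(np, name)
        if missing:
            setattr(np, name, builtin_type)
            restored.append(name)
    return restored


# Call this once before boruta_selection.fit(...) is run.
restore_numpy_aliases()
```

This still patches a third-party module at runtime, so it is only a stopgap until the package itself is fixed.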

ErikHartman commented 9 months ago

There is an open PR fixing this. Basically, it just changes every removed alias of the form np.type (np.int, np.float, np.bool) to np.type_.
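For anyone patching their own code in the meantime, here is a small illustration of that kind of change. The arrays are invented for the example and are not taken from the BorutaPy source. It uses the built-in types rather than the underscore names, because np.float_ was itself removed in NumPy 2.0.

```python
import numpy as np

n_features = 5  # illustrative size only

# Before (fails on NumPy versions where the alias was removed):
#     mask = np.zeros(n_features, dtype=np.bool)
# After: pass the built-in type; NumPy maps it to its own scalar type.
mask = np.zeros(n_features, dtype=bool)
counts = np.zeros(n_features, dtype=int)
scores = np.zeros(n_features, dtype=float)

print(mask.dtype, counts.dtype, scores.dtype)
```

The resulting dtypes are the same ones the old aliases produced, so the rest of the code does not need to change.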

It seems like no one is approving PRs though.

VEZcoding commented 8 months ago

Someone should indeed update this package so it doesn't clash with changes to NumPy.