juaml / julearn

Forschungszentrum Jülich Machine Learning Library
https://juaml.github.io/julearn
GNU Affero General Public License v3.0

Random Forest as Hyperparameter[BUG] #219

Closed samihamdan closed 1 year ago

samihamdan commented 1 year ago

Describe the bug If I pass a model as a hyperparameter and that model is itself an iterable, for example RandomForestRegressor(), julearn treats it as an iterable because it is one. It therefore interprets the model as a set of hyperparameter options, which it is not. The current workaround is to put the iterable model into a list.

To Reproduce

from julearn import run_cross_validation
from sklearn.ensemble import RandomForestRegressor
import seaborn as sns 
df = sns.load_dataset("iris")
df = (df
      .query("species in ['virginica','versicolor']")
      .sample(frac=1)
      .reset_index(drop=True)
     )
X = df.iloc[:,:-1].columns.tolist()
confounds = [X.pop()]
y = "species"

scores, model = run_cross_validation(
    X=X,
    y=y,
    confounds=confounds,
    data=df,
    model="rf",
    return_estimator="final",
    model_params=dict(
        remove_confound__model_confound=RandomForestRegressor()
    ),
)

Expected behavior If a value is both an iterable and a model, it should be used as the model.
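
One way to get that behavior is to check whether a value looks like a model before checking whether it is iterable. The sketch below uses only the standard library. `FakeEnsemble`, `looks_like_estimator`, and `as_options` are hypothetical names for illustration, not julearn's API, and the duck-typing test (`fit` plus `get_params`) is an assumption about what should count as a model.

```python
from collections.abc import Iterable


class FakeEnsemble:
    """Stand-in for an ensemble model: a model that also defines __iter__."""

    def __init__(self):
        self.members = []  # hypothetical: sub-models would be stored here

    def __iter__(self):
        return iter(self.members)

    def fit(self, X, y):
        return self

    def get_params(self, deep=True):
        return {}


def looks_like_estimator(value):
    # Assumption: anything with callable fit and get_params is a model.
    return callable(getattr(value, "fit", None)) and callable(
        getattr(value, "get_params", None)
    )


def as_options(value):
    """Return the list of candidate values for one hyperparameter."""
    if looks_like_estimator(value):
        # A model is a single option, even if it happens to be iterable.
        return [value]
    if isinstance(value, Iterable) and not isinstance(value, (str, bytes)):
        # Any other iterable is a collection of options to try.
        return list(value)
    return [value]
```

With this ordering, `as_options(FakeEnsemble())` yields a one-element list holding the model, while a plain list such as `[1, 2]` is still expanded into separate options.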


fraimondo commented 1 year ago

Have you tried the new API to see if this still happens?

samihamdan commented 1 year ago

Good question. I think we have the same problem in the PipelineCreator. I checked on the branch julearn_sk_pandas (if I remember correctly, this is the newest):

(PipelineCreator(problem_type="classification")
 .add("confound_removal", model_confound=RandomForestRegressor())
)

Same error message, and I think it's the same fix. Wrapping the model in a list, e.g. [RandomForestRegressor()], resolves it because the iterable is then what we expect.
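
To illustrate why the list workaround helps, here is a minimal sketch using a stand-in class instead of the real model. `UnfittedForest` and `naive_options` are hypothetical. The stand-in assumes an ensemble whose `__iter__` reads an attribute that only exists after fitting, and `naive_options` assumes an expansion rule that treats every iterable as a list of options. Neither is taken from julearn's code.

```python
from collections.abc import Iterable


class UnfittedForest:
    """Stand-in for an unfitted ensemble: iterable by type, nothing to iterate yet."""

    def __iter__(self):
        # The attribute is only created by fitting, so this fails beforehand.
        return iter(self.estimators_)


def naive_options(value):
    # Hypothetical rule: every non-string iterable is a list of options to try.
    if isinstance(value, Iterable) and not isinstance(value, str):
        return list(value)
    return [value]


forest = UnfittedForest()

# Wrapped in a list, the outer list is what gets iterated,
# so the model comes out as the single option.
wrapped = naive_options([forest])

# Passed bare, the model itself gets iterated, which fails here.
try:
    naive_options(forest)
    bare_error = None
except AttributeError as exc:
    bare_error = exc
```

The wrapped call returns a list containing only the model, while the bare call ends up in `bare_error` because the stand-in has nothing to iterate over before fitting.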

samihamdan commented 1 year ago

solved in #219