mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
https://mljar.com
MIT License

Google colab - Feature selection not working #717

Open AlizeeL opened 2 months ago

AlizeeL commented 2 months ago

This is my setup: the dataset is a dataframe with numerical values, the target is binary (classification), and I am trying to do feature selection.

    automl = AutoML(
        mode = 'Compete',
        eval_metric = 'f1',
        validation_strategy = {"validation_type": "custom"},
        results_path = folder+'automlfeatsel2'+subject_val,
        explain_level = 1,
        golden_features = False,
        algorithms = ['Xgboost'],
        features_selection = True,
        stack_models = False,
        hill_climbing_steps = 0,
        top_models_to_improve = 5,
        train_ensemble = False,
        start_random_models = 1,
        kmeans_features = False,
        random_state = 42
    )

Hello, I get the following warning when I fit:

log_loss_eps() got an unexpected keyword argument 'response_method'
Problem during computing permutation importance. Skipping ...
'module' object is not callable

Skip features_selection because no parameters were generated.
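For readers unfamiliar with these messages: the first and third lines above have the shape of standard Python `TypeError` texts. The sketch below only reproduces that kind of message in isolation. The `log_loss_eps` defined here is a hypothetical stand-in with an assumed signature, not the mljar-supervised function, and `describe_failure` is a helper made up for this illustration. It does not show where the error originates in the library.

```python
import math


# Hypothetical stand-in for a metric function (assumed signature);
# this is NOT the mljar-supervised implementation.
def log_loss_eps(y_true, y_pred):
    return 0.0


def describe_failure(func, *args, **kwargs):
    """Call func and return the TypeError text instead of raising it."""
    try:
        func(*args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


# Case 1: a caller passes a keyword that the function does not declare.
unexpected_kwarg = describe_failure(
    log_loss_eps, [0, 1], [0.2, 0.8], response_method="predict_proba"
)

# Case 2: a module object is called as if it were a function or class.
module_call = describe_failure(math)

print(unexpected_kwarg)
print(module_call)
```

Errors of the first kind often appear when one library calls into another with an argument that the installed version of the callee does not accept, so comparing installed package versions between a working and a failing environment is a reasonable first check.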

pplonski commented 2 months ago

Thanks @AlizeeL for reporting, it looks like a bug.

May I ask why you are using Colab? Do you need a lot of computational power?

AlizeeL commented 2 months ago

Some of the datasets I am using can be quite big, so yes. Using Colab is also part of my research on the accessibility of such tools to non-expert users.

pplonski commented 2 months ago

@AlizeeL thanks for the response. We are working on a notebook with a UI for code generation that is designed for non-expert users. It is called MLJAR Studio and is available as a desktop app on our website, https://mljar.com/. It is in an early development phase, but CSV data loading and AutoML training are working. I hope you will find it interesting.

AlizeeL commented 2 months ago

Thanks @pplonski, it does look promising.

Do you know if my type of issue might get solved in the near future? I just need to know in case I have to work on a machine instead of Colab.

pplonski commented 2 months ago

Thank you. I'm adding @Bocianski to the discussion about plans for a fix. It would help us a lot if you could provide the full code and data for reproduction.
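Alongside code and data, a reproduction report usually benefits from the versions installed in the failing environment (Colab, an Azure VM, or a local machine). A minimal sketch using only the standard library follows. The list of distribution names is an assumption about which packages are relevant to this issue, and `collect_versions` and `environment_report` are illustrative helper names.

```python
import platform
import sys
from importlib import metadata

# Assumed list of relevant PyPI distributions; adjust as needed.
PACKAGES = ["mljar-supervised", "scikit-learn", "xgboost", "numpy", "pandas"]


def collect_versions(packages):
    """Map each distribution name to its installed version, if any."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def environment_report(packages=PACKAGES):
    """Build a plain-text summary suitable for pasting into an issue."""
    lines = [
        f"python: {sys.version.split()[0]}",
        f"platform: {platform.platform()}",
    ]
    for name, version in collect_versions(packages).items():
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


print(environment_report())
```

Running this in both the failing environment and a working one makes version differences easy to spot.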

AlizeeL commented 2 months ago

Here's my code. There's a link at the top to a dataset. It's a reduced version of my dataset to avoid long computational time. It doesn't change the issues/output.

Let me know if there are any problems with the links. I can send the code and data by email if that's the case. Thanks :)

Reese-Martin commented 2 months ago

I am seeing a similar error while running MLJAR on an Azure VM. Do y'all know why this may be happening?

This is the specific error: "log_loss_eps() got an unexpected keyword argument 'response_method' Problem during computing permutation importance. Skipping ..."