mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
https://mljar.com
MIT License

Problem with param ml_task="regression" #643

Open jiaqizheng2000 opened 1 year ago

jiaqizheng2000 commented 1 year ago

If I set this parameter, the error below is raised for every model. If I remove it, training works fine.

'<' not supported between instances of 'numpy.ndarray' and 'str'
Traceback (most recent call last):
  File "C:\Users\ZHENGJ\AppData\Local\Programs\Python\Python39\lib\site-packages\supervised\base_automl.py", line 1195, in _fit
    trained = self.train_model(params)
  File "C:\Users\ZHENGJ\AppData\Local\Programs\Python\Python39\lib\site-packages\supervised\base_automl.py", line 404, in train_model
    self.keep_model(mf, model_subpath)
  File "C:\Users\ZHENGJ\AppData\Local\Programs\Python\Python39\lib\site-packages\supervised\base_automl.py", line 317, in keep_model
    self.select_and_save_best()
  File "C:\Users\ZHENGJ\AppData\Local\Programs\Python\Python39\lib\site-packages\supervised\base_automl.py", line 1315, in select_and_save_best
    self._best_model = min(
TypeError: '<' not supported between instances of 'numpy.ndarray' and 'str'
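The last frame shows the failure inside a `min(...)` call, so two of the values being compared there had different types: one a NumPy array, the other a string. The sketch below reproduces the same class of error outside mljar-supervised. The `scores` list is a made-up stand-in, not the library's actual data, and the sketch does not identify the root cause.

```python
import numpy as np

# Hypothetical stand-in for the values that min() ends up comparing:
# one entry is a plain string, another is a NumPy array.
scores = ["0.25", np.array([0.31])]


def try_min(values):
    """Return (result, None) on success, or (None, message) if min() raises TypeError."""
    try:
        return min(values), None
    except TypeError as exc:
        return None, str(exc)


result, error = try_min(scores)
print(error)  # the exact wording depends on the installed NumPy version
```

If this is what happens in `select_and_save_best`, printing the type of each value passed to `min` would show which model produced the odd one.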

Benjamin-Frost commented 8 months ago

Same issue here. Did you manage to resolve it?

pplonski commented 8 months ago

Hi @jiaqizheng2000, @Benjamin-Frost, Could you please provide code to reproduce the issue? Thank you!

jiaqizheng2000 commented 8 months ago

from supervised.automl import AutoML

# resultpath, x_train and y_train are not defined in this snippet
automl = AutoML(
    ml_task="regression",
    train_ensemble=True,
    fairness_threshold=0.8,
    results_path=resultpath,
    model_time_limit=30 * 60,
    start_random_models=10,
    top_models_to_improve=3,
    hill_climbing_steps=3,
    golden_features=True,
    features_selection=False,
    stack_models=True,
    explain_level=2,
    validation_strategy={
        "validation_type": "kfold",
        "k_folds": 4,
        "shuffle": False,
        "stratify": True,
    },
)
automl.fit(x_train, y_train)

pplonski commented 8 months ago

Could you please share the dataset as well, or at least a sample of it? Do you get this error on synthetic data too?
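One way to check the synthetic-data question is sketched below. The helper names `make_synthetic_regression` and `run_automl` are made up for illustration, and the data is random, so this only tests whether `ml_task="regression"` fails on clean numeric input. It says nothing about the original dataset.

```python
import numpy as np
import pandas as pd


def make_synthetic_regression(n_rows=200, n_features=5, seed=0):
    """Build a small, purely numeric dataset with a continuous target."""
    rng = np.random.default_rng(seed)
    X = pd.DataFrame(
        rng.normal(size=(n_rows, n_features)),
        columns=[f"f{i}" for i in range(n_features)],
    )
    # Linear signal plus a little noise, so the target is clearly continuous.
    weights = rng.normal(size=n_features)
    noise = rng.normal(scale=0.1, size=n_rows)
    y = pd.Series(X.to_numpy() @ weights + noise, name="target")
    return X, y


def run_automl(X, y):
    """Fit AutoML in regression mode; requires mljar-supervised to be installed."""
    # Imported lazily so the data generation above works without the package.
    from supervised.automl import AutoML

    automl = AutoML(ml_task="regression")
    automl.fit(X, y)
    return automl


X, y = make_synthetic_regression()
print(X.shape, y.dtype)
```

Calling `run_automl(X, y)` and then adding the other arguments from the report one at a time would narrow down which setting triggers the error.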

jiaqizheng2000 commented 8 months ago

example.xlsx

pplonski commented 8 months ago

Thank you! How do you load the data and prepare the X_train and y_train variables?

jiaqizheng2000 commented 8 months ago

I simply take the first column as y_train and the remaining columns as X_train.
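A minimal sketch of that split, assuming the spreadsheet is read into a pandas DataFrame. The helper name `split_first_column_as_target` and the tiny frame are made up for illustration; the real file would be loaded with `pd.read_excel`.

```python
import pandas as pd


def split_first_column_as_target(df):
    """Return (X, y), where y is the first column and X is every other column."""
    y = df.iloc[:, 0]
    X = df.iloc[:, 1:]
    return X, y


# Tiny made-up frame standing in for the spreadsheet contents.
df = pd.DataFrame(
    {"target": [1.5, 2.0, 3.25], "a": [1, 2, 3], "b": [0.1, 0.2, 0.3]}
)
# With the real file this would start from something like:
# df = pd.read_excel("example.xlsx")  # needs an Excel engine such as openpyxl

X_train, y_train = split_first_column_as_target(df)
print(y_train.dtype, list(X_train.columns))
```

Printing `y_train.dtype` is a quick sanity check: a regression target should come out numeric, not `object`.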

Rohan581 commented 7 months ago

@pplonski I would like to work on this issue.

pplonski commented 7 months ago

Sure @Rohan581, thanks!