Closed Possums closed 3 years ago
Currently, only the ["auc", "logloss", "rmse", "mae", "mape"] metrics are supported in Optuna mode. R2 is not supported because there was no native support for the R2 metric in Xgboost, CatBoost, and LightGBM.
This can be fixed by adding a custom eval_metric.
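A minimal sketch of what a custom R2 metric could look like, assuming the LightGBM-style callback convention in which the function receives the predictions and a dataset object exposing get_label(), and returns a (name, value, is_higher_better) tuple. The names r2_score_plain, lightgbm_r2, and FakeDataset are illustrative and are not part of mljar-supervised.

```python
from typing import Sequence, Tuple


def r2_score_plain(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        # Constant target: R2 is undefined. Returning 0.0 is a choice made
        # for this sketch, not a convention taken from the library.
        return 0.0
    return 1.0 - ss_res / ss_tot


def lightgbm_r2(preds, eval_data) -> Tuple[str, float, bool]:
    """Custom metric in the (name, value, is_higher_better) callback shape."""
    y_true = list(eval_data.get_label())
    return "r2", r2_score_plain(y_true, list(preds)), True


class FakeDataset:
    """Stand-in for a dataset object, used only to exercise the callback."""

    def __init__(self, labels):
        self._labels = labels

    def get_label(self):
        return self._labels
```

Because R2 is maximized, the last element of the tuple is True; a tuner that minimizes its objective would need to flip the sign.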
@Possums do you have other metrics in mind worth adding?
Thanks for the quick response!
I would say other metrics that I would find useful are spearman correlation and RMSLE for regression, and perhaps AUCPR and F1 for classification.
@Possums sounds good, let me check the details of which metrics can be added soon.
I've added support for the r2 metric to Optuna mode. The code is pushed to the dev branch.
To try it, please install mljar-supervised from the dev branch:
pip install -U git+https://github.com/mljar/mljar-supervised.git@dev
I will add the rest of the metrics as well (spearman, rmsle, aucpr, f1). For the f1 metric there can be two implementations, one of them with a fixed 0.5 threshold. The 0.5 threshold version is simpler.
Thanks for the quick update! I'm currently training a model and haven't run into any errors so far.
I've added:
- r2, spearman, and pearson correlation metrics for regression tasks,
- f1 score for binary and multi-class classification (Micro average),
- average_precision for binary classification (AUCPR).
I didn't implement RMSLE - the target preprocessing needs to be updated first to ensure positive values only. I will add a ticket for this metric (https://github.com/mljar/mljar-supervised/issues/346).
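To illustrate why RMSLE needs a restricted target range, here is a small sketch using the common log(1 + x) definition of the metric. It is not the implementation planned for the library; the function name rmsle and the choice to raise ValueError are assumptions made for this example.

```python
import math
from typing import Sequence


def rmsle(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared logarithmic error, computed with log(1 + x)."""
    values = list(y_true) + list(y_pred)
    if any(v <= -1.0 for v in values):
        # log(1 + x) is undefined for x <= -1, so such targets or
        # predictions cannot be scored with this metric.
        raise ValueError("RMSLE needs all values to be greater than -1")
    squared = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

A target column containing, say, -2.0 would make the logarithm undefined, which is why the preprocessing has to guarantee a valid range before the metric can be offered.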
All changes are in the dev branch. To install the package directly from the dev branch:
pip install -U git+https://github.com/mljar/mljar-supervised.git@dev
My model just finished training but encountered the following errors.
## Error for 2_Optuna_Xgboost
list index out of range
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1056, in _fit
params["final_loss"] = self._models[-1].get_final_loss()
IndexError: list index out of range
Please set a GitHub issue with above error message at: https://github.com/mljar/mljar-supervised/issues/new
## Error for 3_Optuna_CatBoost
list index out of range
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1056, in _fit
params["final_loss"] = self._models[-1].get_final_loss()
IndexError: list index out of range
Please set a GitHub issue with above error message at: https://github.com/mljar/mljar-supervised/issues/new
## Error for 4_Optuna_RandomForest
list index out of range
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1056, in _fit
params["final_loss"] = self._models[-1].get_final_loss()
IndexError: list index out of range
Please set a GitHub issue with above error message at: https://github.com/mljar/mljar-supervised/issues/new
## Error for 5_Optuna_ExtraTrees
list index out of range
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1056, in _fit
params["final_loss"] = self._models[-1].get_final_loss()
IndexError: list index out of range
Please set a GitHub issue with above error message at: https://github.com/mljar/mljar-supervised/issues/new
Thank you for your help!
Can you reproduce the error? Can you post some code examples to reproduce the error?
The output looks like no model was trained at all.
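The IndexError in errors.md is what Python raises when [-1] is applied to an empty list, which is consistent with no model having been stored before get_final_loss() was called. A small illustration with hypothetical names (DummyModel and last_final_loss are not taken from the library):

```python
from typing import List, Optional


class DummyModel:
    """Hypothetical stand-in for a trained model object."""

    def __init__(self, loss: float):
        self._loss = loss

    def get_final_loss(self) -> float:
        return self._loss


def last_final_loss(models: List[DummyModel]) -> Optional[float]:
    """Return the final loss of the most recent model, or None if the list is empty."""
    if not models:
        # Without this guard, models[-1] raises
        # "IndexError: list index out of range".
        return None
    return models[-1].get_final_loss()
```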
import pandas as pd
from supervised.automl import AutoML

automl = AutoML(mode='Optuna',
                optuna_time_budget=259200,  # 3 days, in seconds
                ml_task='regression',
                eval_metric='r2')
X_train = pd.read_csv('data.csv')
y_train = X_train.pop('target_column')
automl.fit(X_train, y_train)
Here's the code I'm currently using. A model was definitely trained, as it was running at max CPU load for the entire 3-day period. Below is the final output during training.
[I 2021-03-23 16:36:45,359] Trial 2994 finished with value: 0.0051917264869625335 and parameters: {'learning_rate': 0.025, 'num_le[0/1592]
62, 'lambda_l1': 6.989708473974178, 'lambda_l2': 0.05916462162625953, 'feature_fraction': 0.8764481602033438, 'bagging_fraction': 0.957027
3673237645, 'bagging_freq': 3, 'min_data_in_leaf': 52, 'cat_l2': 66.81630949750212, 'cat_smooth': 57.28175791910296}. Best is trial 1889 w
ith value: 0.00591867047625072.
1_Optuna_LightGBM not trained. Stop training after the first fold. Time needed to train on the first fold 153.0 seconds. The time estimate
for training on all folds is larger than total_time_limit.
There was an error during 2_Optuna_Xgboost training.
Please check AutoML_1/errors.md for details.
There was an error during 3_Optuna_CatBoost training.
Please check AutoML_1/errors.md for details.
There was an error during 4_Optuna_RandomForest training.
Please check AutoML_1/errors.md for details.
There was an error during 5_Optuna_ExtraTrees training.
Please check AutoML_1/errors.md for details.
Skip golden_features because no parameters were generated.
Skip insert_random_feature because no parameters were generated.
Skip features_selection because no parameters were generated.
Skip boost_on_errors because no parameters were generated.
2021-03-23 16:39:26,539 supervised.exceptions ERROR No models produced.
Please check your data or submit a Github issue at https://github.com/mljar/mljar-supervised/issues/new.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 12, in <module>
File "/usr/local/lib/python3.8/dist-packages/supervised/automl.py", line 323, in fit
return self._fit(X, y, sample_weight)
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1091, in _fit
raise e
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1010, in _fit
raise AutoMLException(
supervised.exceptions.AutoMLException: No models produced.
Please check your data or submit a Github issue at https://github.com/mljar/mljar-supervised/issues/new.
@Possums I'm afraid that you were running hyperparameter optimization with Optuna for 3 days. After 3 days you have only tuned hyperparameters and no model was trained. The good news is that you should have an optuna/optuna.json file with the tuned parameters, so you will not need to tune them again and the training will be faster.
To reuse the optuna/optuna.json parameters you need to pass them to AutoML:
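import json
from supervised.automl import AutoML

# Load the parameters tuned in the previous run.
with open("dir_with_params/optuna/optuna.json", "r") as f:
    init_params = json.load(f)
automl = AutoML(
    #
    # the config params ...,
    #
    optuna_init_params=init_params
)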
What can we do?
1. Can you run it in Compete mode with a small total_time_limit=120?
2. Are you running the newest code from the dev branch?
3. Can you run it with a small optuna_time_budget=10, so it will fail quickly? For example, run it with algorithms=["Xgboost"].
Thank you again for the support.
Here is the output of optuna.json
{
    "original_LightGBM": {
        "learning_rate": 0.025,
        "num_leaves": 180,
        "lambda_l1": 3.818029469419764,
        "lambda_l2": 4.779917528983006e-05,
        "feature_fraction": 0.7777459702498414,
        "bagging_fraction": 0.9485177139222203,
        "bagging_freq": 4,
        "min_data_in_leaf": 74,
        "cat_l2": 57.416326938339566,
        "cat_smooth": 41.70937114259941,
        "metric": "custom",
        "custom_eval_metric_name": "r2",
        "num_boost_round": 1000,
        "early_stopping_rounds": 50,
        "cat_feature": [
            1
        ],
        "feature_pre_filter": false,
        "seed": 123
    }
}
Your first suggestion (compete mode, total_time_limit=120) works fine.
* Step not_so_random will try to check up to 54 models
11_LightGBM r2 0.000189 trained in 55.45 seconds
Skip golden_features because no parameters were generated.
Skip insert_random_feature because no parameters were generated.
Skip features_selection because no parameters were generated.
Skip hill_climbing_1 because of the time limit.
Skip hill_climbing_2 because of the time limit.
* Step ensemble will try to check up to 1 model
Ensemble r2 0.000658 trained in 0.04 seconds
AutoML fit time: 131.9 seconds
AutoML best model: 1_DecisionTree
I am running the newest code from the dev branch.
Here's the output for the 3rd suggestion (optuna_time_budget=10, algorithms=['Xgboost'])
There was an error during 1_Optuna_Xgboost training.
Please check AutoML_3/errors.md for details.
Skip golden_features because no parameters were generated.
Skip insert_random_feature because no parameters were generated.
Skip features_selection because no parameters were generated.
Skip boost_on_errors because no parameters were generated.
2021-03-24 14:22:38,485 supervised.exceptions ERROR No models produced.
Please check your data or submit a Github issue at https://github.com/mljar/mljar-supervised/issues/new.
Traceback (most recent call last):
File "train.py", line 14, in <module>
automl.fit(X_train, y_train)
File "/usr/local/lib/python3.8/dist-packages/supervised/automl.py", line 323, in fit
return self._fit(X, y, sample_weight)
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1092, in _fit
raise e
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1011, in _fit
raise AutoMLException(
supervised.exceptions.AutoMLException: No models produced.
Please check your data or submit a Github issue at https://github.com/mljar/mljar-supervised/issues/new.
And here's errors.md
name 'xgboost_objective' is not defined
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1055, in _fit
trained = self.train_model(params)
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 356, in train_model
mf.train(results_path, model_subpath)
File "/usr/local/lib/python3.8/dist-packages/supervised/model_framework.py", line 188, in train
self.learner_params = optuna_tuner.optimize(
File "/usr/local/lib/python3.8/dist-packages/supervised/tuner/optuna/tuner.py", line 133, in optimize
objective = XgboostObjective(
File "/usr/local/lib/python3.8/dist-packages/supervised/tuner/optuna/xgboost.py", line 54, in __init__
self.objective = xgboost_objective(ml_task, eval_metric.name)
NameError: name 'xgboost_objective' is not defined
When running in Optuna mode, is total_time_limit also necessary? I'm thinking that this could be the issue, since I didn't set that value. Could the library be spending all the time on parameter config, and not actually training models?
@Possums you are a genius! Of course, the problem is with total_time_limit! I fixed it today in #347: the Optuna optimization is no longer counted as model training time. Please get the newest code from the dev branch and run it once again.
The example code:
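import json
from supervised.automl import AutoML

# Load the parameters tuned in the previous run.
with open("dir_with_params/optuna/optuna.json", "r") as f:
    init_params = json.load(f)
automl = AutoML(
    #
    # the config params ...,
    #
    total_time_limit=4*3600,
    optuna_time_budget=1800,
    optuna_init_params=init_params
)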
Please run it with a small optuna_time_budget first, maybe 60 seconds, and total_time_limit=1800.
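The "not trained" log message suggests a check roughly like the one below: the time measured on the first fold is extrapolated to all folds and compared with total_time_limit. This is only a sketch under stated assumptions. The linear extrapolation, the function name fits_time_limit, the fold count of 5, and the 3600-second limit used in the usage lines are all hypothetical; only the 153.0-second fold time and the 259200-second Optuna budget come from the thread.

```python
def fits_time_limit(
    first_fold_seconds: float,
    n_folds: int,
    total_time_limit: float,
    already_used: float = 0.0,
) -> bool:
    """Return True if training on all folds is estimated to fit in the limit."""
    # Assumption: every fold takes about as long as the first one.
    estimate = first_fold_seconds * n_folds
    return already_used + estimate <= total_time_limit


# Tuning time not counted against the limit: the model fits.
print(fits_time_limit(153.0, 5, 3600.0))
# Three days of tuning counted as training time: the model is skipped.
print(fits_time_limit(153.0, 5, 3600.0, already_used=259200.0))
```

Under these assumptions, counting the Optuna budget as training time exhausts the limit before any fold is trained, which matches the behaviour described before the fix.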
You're the best! Really appreciate the constant improvements that you're making to mljar. I will try the latest version with a shorter time limit and report back.
AutoML directory: AutoML_2
The task is regression with evaluation metric r2
AutoML will use algorithms: ['Random Forest', 'Extra Trees', 'LightGBM', 'Xgboost', 'CatBoost']
AutoML will stack models
AutoML will ensemble availabe models
AutoML steps: ['simple_algorithms', 'default_algorithms', 'golden_features', 'insert_random_feature', 'features_selection', 'boost_on_errors', 'ensemble', 'stack', 'ensemble_stacked']
Skip simple_algorithms because no parameters were generated.
* Step default_algorithms will try to check up to 5 models
1_Optuna_LightGBM not trained. Stop training after the first fold. Time needed to train on the first fold 220.0 seconds. The time estimate for training on all folds is larger than total_time_limit.
Optuna optimizes Xgboost with time budget 60 seconds eval_metric r2 (maximize)
[I 2021-03-24 15:18:40,405] A new study created in memory with name: no-name-17a54d29-78cb-452b-987c-1decaa7a780e
[I 2021-03-24 15:19:37,287] Trial 0 finished with value: 0.003899558467413411 and parameters: {'eta': 0.1, 'max_depth': 5, 'lambda': 0.003971722885615567, 'alpha': 8.70060897787558e-05, 'colsample_bytree': 0.8497510164532243, 'subsample': 0.8459830734829203, 'min_child_weight': 24}. Best is trial 0 with value: 0.003899558467413411.
[I 2021-03-24 15:20:17,429] Trial 1 finished with value: 0.0015173661332023025 and parameters: {'eta': 0.025, 'max_depth': 12, 'lambda': 6.118836254358573e-07, 'alpha': 0.21700333406861005, 'colsample_bytree': 0.41117074881725907, 'subsample': 0.3812964804943624, 'min_child_weight': 93}. Best is trial 0 with value: 0.003899558467413411.
2_Optuna_Xgboost r2 0.003116 trained in 1013.01 seconds
Skip golden_features because no parameters were generated.
Skip insert_random_feature because no parameters were generated.
Skip features_selection because no parameters were generated.
Skip boost_on_errors because no parameters were generated.
* Step ensemble will try to check up to 1 model
Skip stack because no parameters were generated.
Skip ensemble_stacked because no parameters were generated.
AutoML fit time: 1388.33 seconds
AutoML best model: 2_Optuna_Xgboost
Using the total_time_limit, optuna_time_budget, and optuna_init_params generated a working model! Now I'll experiment with a longer time limit, fingers crossed it works.
## Error for 3_Optuna_CatBoost
Bad value for num_feature[non_default_doc_idx=0,feature_idx=0]="COLUMN_ONE": Cannot convert 'b'COLUMN_ONE'' to float
Traceback (most recent call last):
File "_catboost.pyx", line 1980, in _catboost.get_float_feature
File "_catboost.pyx", line 1085, in _catboost._FloatOrNan
File "_catboost.pyx", line 917, in _catboost._FloatOrNanFromString
TypeError: Cannot convert 'b'COLUMN_ONE'' to float
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1055, in _fit
trained = self.train_model(params)
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 356, in train_model
mf.train(results_path, model_subpath)
File "/usr/local/lib/python3.8/dist-packages/supervised/model_framework.py", line 188, in train
self.learner_params = optuna_tuner.optimize(
File "/usr/local/lib/python3.8/dist-packages/supervised/tuner/optuna/tuner.py", line 146, in optimize
objective = CatBoostObjective(
File "/usr/local/lib/python3.8/dist-packages/supervised/tuner/optuna/catboost.py", line 42, in __init__
self.eval_set = Pool(
File "/usr/local/lib/python3.8/dist-packages/catboost/core.py", line 455, in __init__
self._init(data, label, cat_features, text_features, embedding_features, pairs, weight, group_id, group_weight, subgroup_id, pairs_weight, baseline, feature_names, thread_count)
File "/usr/local/lib/python3.8/dist-packages/catboost/core.py", line 966, in _init
self._init_pool(data, label, cat_features, text_features, embedding_features, pairs, weight, group_id, group_weight, subgroup_id, pairs_weight, baseline, feature_names, thread_count)
File "_catboost.pyx", line 3550, in _catboost._PoolBase._init_pool
File "_catboost.pyx", line 3597, in _catboost._PoolBase._init_pool
File "_catboost.pyx", line 3438, in _catboost._PoolBase._init_features_order_layout_pool
File "_catboost.pyx", line 2477, in _catboost._set_features_order_data_pd_data_frame
File "_catboost.pyx", line 2021, in _catboost.create_num_factor_data
File "_catboost.pyx", line 1982, in _catboost.get_float_feature
_catboost.CatBoostError: Bad value for num_feature[non_default_doc_idx=0,feature_idx=0]="COLUMN_ONE": Cannot convert 'b'COLUMN_ONE'' to float
Just ran into this issue during the longer model test, any ideas what could be causing it? The other models ran fine with optuna, only catboost seems to be doing this.
@Possums that might be some problem with the categorical column "COLUMN_ONE" in the data? CatBoost is using categoricals without encoding into numbers. I've checked the CatBoost repo and found something like this: https://github.com/catboost/catboost/issues/934
Could you send me data from this column? Maybe you could replace original values with some anonymized ones? Or try to simulate similar data so I can try to reproduce the problem?
Actually, I just realized I didn't anonymize the output correctly. COLUMN_ONE is actually the data point period_1 in the column time_period, which, like you said, is a categorical column with strings ranging from period_1 to period_3000. As such, the conversion error appears to be with the specific values in the column rather than the column name.
Example data would be "period_1", "period_2", "period_3". This column isn't actually that important to me, so I'll probably end up dropping it, but it would be good to have a way to specify categorical variables.
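If the column is kept rather than dropped, one option is to turn labels such as "period_1" into integers before training, so that no algorithm receives raw strings. A minimal sketch, assuming every value follows the period_<number> pattern described above; period_to_int is a hypothetical helper, not part of any library.

```python
import re

# Matches labels of the form "period_<digits>" and nothing else.
_PERIOD_PATTERN = re.compile(r"^period_(\d+)$")


def period_to_int(label: str) -> int:
    """Convert a label like 'period_12' to the integer 12."""
    match = _PERIOD_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Unexpected label: {label!r}")
    return int(match.group(1))


labels = ["period_1", "period_2", "period_3"]
print([period_to_int(label) for label in labels])
```

With pandas, the same helper could be applied through X_train["time_period"].map(period_to_int). Note that this treats the periods as ordered numbers rather than as unordered categories, which may or may not suit the data.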
@Possums MLJAR AutoML automatically detects categorical features. For algorithms other than CatBoost, categorical values are converted into numbers. In your results_path directory there should be a data_info.json file, where you can check the details about your data.
I would rather bet that there might be some problem with CatBoost ...
Got it, checked the data_info.json file and indeed the column is marked as categorical.
You can try to optimize only CatBoost and upgrade CatBoost to the latest version, 0.25. MLJAR is using 0.24.4 (I didn't update it yet), so maybe this will help ...
The other option is to create a minimal reproducible example and try to catch the bug.
@Possums I'm closing this issue. Thank you for all your help and feedback. If you have any problems, please open a new issue.
Currently, r2 metric evaluation is not supported in the tuner/optuna/tuner.py file.
if eval_metric.name not in ["auc", "logloss", "rmse", "mae", "mape"]:
    raise AutoMLException(f"Metric {eval_metric.name} is not supported")
When I manually add 'r2' to the list, I encounter the following error.
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 1054, in _fit
trained = self.train_model(params)
File "/usr/local/lib/python3.8/dist-packages/supervised/base_automl.py", line 356, in train_model
mf.train(results_path, model_subpath)
File "/usr/local/lib/python3.8/dist-packages/supervised/model_framework.py", line 185, in train
self.learner_params = optuna_tuner.optimize(
File "/usr/local/lib/python3.8/dist-packages/supervised/tuner/optuna/tuner.py", line 106, in optimize
objective = LightgbmObjective(
File "/usr/local/lib/python3.8/dist-packages/supervised/tuner/optuna/lightgbm.py", line 61, in __init__
self.eval_metric_name = metric_name_mapping[ml_task][self.eval_metric.name]
KeyError: 'r2'
Is this a known limitation, and if so, is there a way to work around it?
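The KeyError indicates that the whitelist is not the only place where metric names are registered: the tuner also looks the name up in a per-task mapping. The sketch below reproduces that pattern with hypothetical mapping contents; the real metric_name_mapping in tuner/optuna/lightgbm.py may hold different entries, and resolve_metric is an illustrative helper.

```python
from typing import Dict, List

# Hypothetical contents, for illustration only.
metric_name_mapping: Dict[str, Dict[str, str]] = {
    "regression": {"rmse": "rmse", "mae": "l1", "mape": "mape"},
}


def resolve_metric(ml_task: str, name: str, supported: List[str]) -> str:
    """Mimic the two-stage lookup: whitelist check first, then the mapping."""
    if name not in supported:
        raise ValueError(f"Metric {name} is not supported")
    # Raises KeyError if the name passed the whitelist but has no mapping entry.
    return metric_name_mapping[ml_task][name]


supported = ["auc", "logloss", "rmse", "mae", "mape"]
print(resolve_metric("regression", "rmse", supported))

# Adding "r2" to the whitelist alone moves the failure to the mapping lookup.
try:
    resolve_metric("regression", "r2", supported + ["r2"])
except KeyError as exc:
    print("KeyError:", exc)
```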