[12:39:46] Stdout logging level is DEBUG.
[12:39:46] Copying TaskTimer may affect the parent PipelineTimer, so copy will create new unlimited TaskTimer
[12:39:46] Task: reg
[12:39:46] Start automl preset with listed constraints:
[12:39:46] - time: 10000000000000.00 seconds
[12:39:46] - CPU: 4 cores
[12:39:46] - memory: 16 GB
[12:39:46] Train data shape: (6019, 14)
[12:39:50] Feats was rejected during automatic roles guess: []
[12:39:50] Layer 1 train process start. Time left 9999999999996.40 secs
[12:39:50] Start fitting Lvl_0_Pipe_0_Mod_0_LinearL2 ...
[12:39:50] Training params: {'tol': 1e-06, 'max_iter': 100, 'cs': [1e-05, 5e-05, 0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000], 'early_stopping': 2, 'categorical_idx': [21, 22, 23], 'embed_sizes': array([11, 12, 3], dtype=int32), 'data_size': 24}
[12:39:50] ===== Start working with fold 0 for Lvl_0_Pipe_0_Mod_0_LinearL2 =====
[12:39:50] Linear model: C = 1e-05 score = -128.8645392871719
[12:39:50] Linear model: C = 5e-05 score = -121.64156159218489
[12:39:50] Linear model: C = 0.0001 score = -114.93390055055276
[12:39:50] Linear model: C = 0.0005 score = -89.57974689706185
[12:39:50] Linear model: C = 0.001 score = -77.2702312814782
Traceback (most recent call last):
File "/Users/user/projects/LAML_dev/work/RMSLE_issue.py", line 21, in <module>
main()
File "/Users/user/projects/LAML_dev/work/RMSLE_issue.py", line 18, in main
automl.fit_predict(df, roles=roles, verbose=5)
File "/Users/user/projects/LAML_dev/LightAutoML/lightautoml/automl/presets/tabular_presets.py", line 525, in fit_predict
train, roles=roles, cv_iter=cv_iter, valid_data=valid_data, verbose=verbose
File "/Users/user/projects/LAML_dev/LightAutoML/lightautoml/automl/presets/base.py", line 211, in fit_predict
verbose,
File "/Users/user/projects/LAML_dev/LightAutoML/lightautoml/automl/base.py", line 225, in fit_predict
pipe_pred = ml_pipe.fit_predict(train_valid)
File "/Users/user/projects/LAML_dev/LightAutoML/lightautoml/pipelines/ml/base.py", line 150, in fit_predict
), "Pipeline finished with 0 models for some reason.\nProbably one or more models failed"
AssertionError: Pipeline finished with 0 models for some reason.
Probably one or more models failed
[12:39:50] Model Lvl_0_Pipe_0_Mod_0_LinearL2 failed during ml_algo.fit_predict call.
Input contains NaN, infinity or a value too large for dtype('float32').
It turned out that the package fails when I use the "Used Card Price" dataset with the RMSLE loss function (see the error stack in the log).
After a long investigation, I can conclude that:
I suspect that the problem is unexpected behavior of the LBFGS solver with a non-smooth loss function, and I propose several solutions:
For now, I have opened a PR to catch the pipeline failure.
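To illustrate how a NaN like the one in the log message could appear, here is a minimal sketch that is independent of LightAutoML. It assumes RMSLE is computed with `log1p` and that a linear model is free to produce a prediction below -1 at some solver step; both are assumptions for illustration, not a confirmed diagnosis. The helpers `rmsle` and `rmsle_clipped` and the sample arrays are hypothetical and do not come from the package.

```python
import numpy as np


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error; undefined for inputs <= -1."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    # Silence NumPy warnings so the NaN propagates quietly, as it would
    # inside a training loop that does not check intermediate values.
    with np.errstate(invalid="ignore", divide="ignore"):
        diff = np.log1p(y_pred) - np.log1p(y_true)
    return float(np.sqrt(np.mean(diff ** 2)))


def rmsle_clipped(y_true, y_pred, floor=0.0):
    """Hypothetical guard: clip predictions into the domain of log1p first."""
    return rmsle(y_true, np.clip(y_pred, floor, None))


# Made-up values, for illustration only.
y_true = np.array([1.5, 4.0, 12.0])
good_pred = np.array([2.0, 3.5, 10.0])
bad_pred = np.array([2.0, -1.5, 10.0])  # one prediction below -1

print(rmsle(y_true, good_pred))         # finite
print(rmsle(y_true, bad_pred))          # nan: log1p(-1.5) is undefined
print(rmsle_clipped(y_true, bad_pred))  # finite again after clipping
```

A single out-of-domain prediction is enough to turn the whole score into NaN, which any downstream input validation would then reject. Clipping is only one possible guard; it removes the NaN but does not address solver behavior.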