automl / auto-sklearn

Automated Machine Learning with scikit-learn
https://automl.github.io/auto-sklearn
BSD 3-Clause "New" or "Revised" License

Getting out of memory from Dummy prediction no matter how much memory is allocated. #978

Closed shihgianlee closed 2 years ago

shihgianlee commented 4 years ago

Hello,

I am running auto-sklearn in a Jupyter notebook on a Google Cloud machine. I keep getting an out-of-memory error no matter how much memory I assign to ml_memory_limit. This is the error message:

ValueError: Dummy prediction failed with run state StatusType.MEMOUT and additional output: {'error': 'Memout (used more than 5000 MB).', 'configuration_origin': 'DUMMY'}.

The following is my initialization code:

import autosklearn.classification

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=300,
    per_run_time_limit=30,
    ml_memory_limit=5000,
    ensemble_size=0,
    include_preprocessors=["no_preprocessing"])

automl.fit(X_train.values, y_train.index.values)

X_train has 400K rows and 5 columns. y_train is a vector with 400K rows. I am using auto-sklearn==0.10.0. I have raised ml_memory_limit beyond 5000 MB, but the program still returns quickly with the same error, so ml_memory_limit doesn't seem to be honored. I have tried the suggestions in issue #520, but to no avail.
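For scale, here is a rough estimate of how much memory the raw feature matrix itself needs. It assumes the values are stored as 64-bit floats (the post does not state the dtype), and `dense_size_mib` is a hypothetical helper written only for this illustration:

```python
def dense_size_mib(rows: int, cols: int, bytes_per_value: int = 8) -> float:
    """Size in MiB of a dense rows x cols array (8 bytes per value = float64)."""
    return rows * cols * bytes_per_value / 1024**2


# 400K rows x 5 columns, as described above; float64 storage is an assumption.
x_train_mib = dense_size_mib(400_000, 5)
print(f"X_train is roughly {x_train_mib:.1f} MiB")  # prints "X_train is roughly 15.3 MiB"
```

Under that assumption the input array is a few hundred times smaller than the 5000 MB limit, which suggests the memory is going to something other than holding the data itself.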

I tried to run the following example in the Jupyter notebook to make sure I am using the library correctly:

import autosklearn.classification
import sklearn.datasets
import sklearn.metrics

X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120,
    per_run_time_limit=30
)
automl.fit(X_train, y_train, dataset_name='breast_cancer')

It finished training successfully.

I would appreciate any help from the community!

Environment:

Python version: 3.7.8
Scikit-learn version: 0.22.2.post1
OS: Debian 9
auto-sklearn: 0.10.0

shihgianlee commented 4 years ago

It turned out that I hadn't assigned enough memory. The ml_memory_limit does work. But even after I increased the memory limit to more than 600 GB, I could not get very far before hitting the error again.

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=3600,
    per_run_time_limit=1900,
    ml_memory_limit=600*1024,
    ensemble_size=1,
    ensemble_memory_limit=7*1024,
    initial_configurations_via_metalearning=0,
    include_preprocessors=["no_preprocessing"],
    tmp_folder='./tmp/',
    output_folder='./out/',
    delete_output_folder_after_terminate=False,
    delete_tmp_folder_after_terminate=False,
)

I am surprised that auto-sklearn consumes so much memory for 400K rows of data. A single XGBoost instance can finish training pretty quickly on a medium machine. I can see the value of auto-sklearn, but it is discouraging that it requires so much memory for a dataset that is not very large.

I would like to give it another try if someone can point out how I can save some memory or if I am doing something wrong.

mfeurer commented 4 years ago

Hi @shihgianlee thanks a lot for reporting this issue. I'm really unsure why this happens as 6GB for 400k instances sounds sufficient.

Two steps to move forward:

Also, out of curiosity, how many attributes does your dataset have?

shihgianlee commented 4 years ago

Hi @mfeurer, if I remember correctly, I subsampled 5K rows of data and used 10 GB of memory. It didn't throw a memory error but took a long time to complete; I gave up waiting after about an hour. I only have 5 attributes.

ach4l commented 3 years ago

Hello @mfeurer. I am facing the same issue while running autosklearn on Kaggle. The dataset is only 2.2 GB: about 400k rows as well, but only 4 columns. Locally I have seen sklearn handle bigger datasets with less memory. I don't know if this is a cloud-related issue.

eddiebergman commented 3 years ago

Hi @shihgianlee , @ach4l,

Sorry it's been a while, but to clarify, it seems these issues only happen in cloud based infrastructure like GCP and Kaggle? Do these issues also happen locally?

While we don't test on cloud infrastructure beyond unit testing on GitHub Actions, it would be interesting to find out what the root cause of these memory issues is.

f-istvan commented 3 years ago

Hi @shihgianlee, @ach4l, I faced a very similar issue. Here are the details:

Dataset:

Size: 197.9 MB
Columns: 89
Rows: 501808 

Init params:

automl = autosklearn.regression.AutoSklearnRegressor(
    time_left_for_this_task=3600,
    per_run_time_limit=360,
    memory_limit=27000
)

df_cv_results

    mean_test_score  mean_fit_time                                             params  rank_test_scores   status  budgets  ... param_regressor:libsvm_svr:gamma param_regressor:mlp:validation_fraction param_regressor:sgd:epsilon param_regressor:sgd:eta0 param_regressor:sgd:l1_ratio param_regressor:sgd:power_t
1          0.001069     206.981234  {'data_preprocessing:categorical_transformer:c...                 1  Success      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
12         0.000141      21.207443  {'data_preprocessing:categorical_transformer:c...                 2  Success      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
7          0.000014      12.729235  {'data_preprocessing:categorical_transformer:c...                 3  Success      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
0          0.000000     360.100346  {'data_preprocessing:categorical_transformer:c...                 4  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
15         0.000000       9.028285  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
25         0.000000       4.793600  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
24         0.000000       6.720857  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
23         0.000000     360.019049  {'data_preprocessing:categorical_transformer:c...                 4  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
22         0.000000      31.379792  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
21         0.000000      16.599984  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     0.1                         NaN                      NaN                          NaN                         NaN
20         0.000000     360.116118  {'data_preprocessing:categorical_transformer:c...                 4  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
19         0.000000       9.361809  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
18         0.000000       5.814345  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
17         0.000000     360.115730  {'data_preprocessing:categorical_transformer:c...                 4  Timeout      0.0  ...                         0.032332                                     NaN                         NaN                      NaN                          NaN                         NaN
16         0.000000       6.615313  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
14         0.000000     360.080400  {'data_preprocessing:categorical_transformer:c...                 4  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
13         0.000000     360.043842  {'data_preprocessing:categorical_transformer:c...                 4  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
11         0.000000       6.372612  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
9          0.000000      19.444851  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
8          0.000000       8.804391  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
6          0.000000     360.117929  {'data_preprocessing:categorical_transformer:c...                 4  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
5          0.000000       8.347663  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
3          0.000000       5.497032  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
2          0.000000       5.126574  {'data_preprocessing:categorical_transformer:c...                 4   Memout      0.0  ...                         0.002623                                     NaN                         NaN                      NaN                          NaN                         NaN
27         0.000000     217.114209  {'data_preprocessing:categorical_transformer:c...                 4  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
4         -0.002910      62.450895  {'data_preprocessing:categorical_transformer:c...                26  Success      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
26        -0.007268      20.792192  {'data_preprocessing:categorical_transformer:c...                27  Success      0.0  ...                              NaN                                     NaN                    0.000047                      NaN                     0.018917                         NaN
10      -456.128305     257.670972  {'data_preprocessing:categorical_transformer:c...                28  Success      0.0  ...                              NaN                                     0.1                         NaN                      NaN                          NaN                         NaN

automl.leaderboard

          rank  ensemble_weight               type        cost    duration  config_id  train_loss  seed    start_time      end_time  budget              status                                 data_preprocessors                 feature_preprocessors balancing_strategy           config_origin
model_id                                                                                                                                                                                                                                                                                          
3            1             0.68  gradient_boosting    0.998931  206.981234          2    0.950685     0  1.631515e+09  1.631515e+09     0.0  StatusType.SUCCESS           [one_hot_encoding, no_coalescense, none]             [select_rates_regression]               None          Initial design
14           2             0.32  gradient_boosting    0.999859   21.207443         13    0.999518     0  1.631516e+09  1.631516e+09     0.0  StatusType.SUCCESS  [one_hot_encoding, minority_coalescer, robust_...                    [no_preprocessing]               None          Initial design
9            3             0.00  gradient_boosting    0.999986   12.729235          8    0.999978     0  1.631516e+09  1.631516e+09     0.0  StatusType.SUCCESS     [one_hot_encoding, minority_coalescer, minmax]             [select_rates_regression]               None          Initial design
6            4             0.00  gradient_boosting    1.002910   62.450895          5    0.964131     0  1.631515e+09  1.631515e+09     0.0  StatusType.SUCCESS       [no_encoding, no_coalescense, robust_scaler]               [feature_agglomeration]               None          Initial design
28           5             0.00                sgd    1.007268   20.792192         27    1.007495     0  1.631518e+09  1.631518e+09     0.0  StatusType.SUCCESS           [one_hot_encoding, no_coalescense, none]             [select_rates_regression]               None  Random Search (sorted)
12           6             0.00                mlp  457.128305  257.670972         11    0.981957     0  1.631516e+09  1.631516e+09     0.0  StatusType.SUCCESS    [one_hot_encoding, no_coalescense, standardize]  [extra_trees_preproc_for_regression]               None          Initial design

My system and versions:

This machine runs in VirtualBox. Host: Windows. Guest: Linux.

auto-sklearn = "==0.13.0"
python_version = "3.7"

$ uname -a

Linux i 5.4.0-58-generic #64-Ubuntu SMP Wed Dec 9 08:16:25 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ lscpu

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          14
On-line CPU(s) list:             0-13
Thread(s) per core:              1
Core(s) per socket:              14
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           158
Model name:                      Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
Stepping:                        13
CPU MHz:                         3600.006
BogoMIPS:                        7200.01
Hypervisor vendor:               KVM
Virtualization type:             full
L1d cache:                       448 KiB
L1i cache:                       448 KiB
L2 cache:                        3,5 MiB
L3 cache:                        224 MiB

I see lots of memouts even though memory_limit is 27000. Am I doing something wrong?
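To put a number on "lots", the run statuses in the df_cv_results table above can be tallied. This is a minimal sketch that uses only the status values copied from that table, in row order:

```python
from collections import Counter

# 'status' values copied in row order from the df_cv_results table above.
statuses = (
    "Success Success Success Timeout Memout Memout Memout Timeout "
    "Memout Memout Timeout Memout Memout Timeout Memout Timeout Timeout "
    "Memout Memout Memout Timeout Memout Memout Memout Timeout "
    "Success Success Success"
).split()

counts = Counter(statuses)
for status, n in counts.most_common():
    print(f"{status:8s} {n:2d}  ({n / len(statuses):.0%})")
```

Half of the 28 runs ended in Memout, 8 timed out, and only 6 succeeded. On a live run the same tally could presumably be taken with `Counter(automl.cv_results_['status'])`, assuming the `status` entry shown in the table is available there.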

I have a Linux laptop as well, so I will try the same run on Linux without any virtualization and post my findings here.

f-istvan commented 3 years ago

Hi @eddiebergman, @mfeurer, I tested this on the other physical machine I have. I ran into the same issue on that Linux machine with no virtualization at all. Could you please take a look and check what I'm doing wrong? I'm also happy to have a call and show the issue if needed.

Otherwise I won't be able to use this lib and have to switch to something else.

Regards, Stefan

eddiebergman commented 3 years ago

Hi @f-istvan,

Sorry for the delay. I can't immediately see anything wrong with your setup, although one general recommendation is to use more of your available cores once the memout issues are fixed.

For some context, the fact that so many memouts occur indicates to me a few possible reasons:

Diagnosing those issues can be done if you post the output of df_cv_results['params'] as this essentially contains the high level model definition that was tried with SMAC (our underlying optimizer).

Do the same issues appear at smaller timescales? i.e. 600s total time and 60s per model?

If you could provide this extra information, hopefully that will be enough to diagnose it

bill-95 commented 3 years ago

@f-istvan,

Did you try setting memory_limit=None?

f-istvan commented 3 years ago

Hi,

Sorry for the late response. First of all, here is a full example with generated training data with results:

import numpy as np
import pandas as pd
import autosklearn.regression

value_set = [0.0, 0.25, 0.5, 0.75, 1.0]

col = 89
row = 501808
training_data = np.random.choice(value_set, col * row).reshape(row, col)

df = pd.DataFrame(data=training_data)
print(df)

automl = autosklearn.regression.AutoSklearnRegressor(
    time_left_for_this_task=3600,
    per_run_time_limit=360,
    memory_limit=27000
)

col = 1
row = 501808
target = np.random.choice(value_set, col * row).reshape(row, col)

print('start fit')
automl.fit(training_data, target, dataset_name='github_issue')
print('end fit')

df_cv_results = pd.DataFrame(automl.cv_results_).sort_values(by = 'mean_test_score', ascending = False)

print('df_cv_results')
print(df_cv_results)

print('automl.leaderboard')
print(automl.leaderboard(detailed = True, ensemble_only=False))

print('automl.get_models_with_weights')
print(automl.get_models_with_weights())

print('automl.sprint_statistics')
print(automl.sprint_statistics())

Output:

         0     1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16    17    18    19    20    21    22    23    24  ...    64    65    66    67    68    69    70    71    72    73    74    75    76    77    78    79    80    81    82    83    84    85    86    87    88
0       0.00  0.25  0.50  0.00  0.25  0.50  0.25  0.00  0.00  0.25  1.00  0.00  1.00  1.00  1.00  0.00  0.25  0.50  0.50  0.25  0.50  0.75  0.00  1.00  1.00  ...  0.75  0.75  0.25  0.00  0.50  0.50  0.50  0.50  0.75  0.75  0.75  0.25  1.00  1.00  1.00  1.00  0.25  0.50  0.25  0.50  0.75  0.75  1.00  0.75  0.00
1       0.25  1.00  0.25  0.00  0.75  0.25  0.50  0.00  0.50  0.50  0.25  0.50  0.00  0.50  0.25  0.50  0.75  0.75  0.75  0.25  0.00  0.25  1.00  0.00  0.50  ...  1.00  1.00  0.00  1.00  0.25  0.75  0.50  1.00  0.25  1.00  1.00  0.50  0.50  0.75  0.25  0.00  0.75  0.75  1.00  1.00  0.00  0.00  0.25  0.50  0.75
2       0.75  0.25  0.00  1.00  0.50  0.50  0.25  0.50  0.75  0.25  0.50  0.25  0.50  0.75  0.25  0.25  0.00  0.75  0.00  0.50  0.50  0.25  0.75  0.75  0.75  ...  0.25  0.25  0.25  1.00  0.25  0.75  0.75  0.00  0.75  0.25  0.25  0.25  1.00  0.50  0.75  0.50  0.25  0.25  0.25  0.00  0.00  0.50  1.00  0.50  0.25
3       0.00  0.50  0.25  0.25  0.50  0.75  0.50  0.25  0.00  0.75  0.50  0.50  0.25  1.00  0.00  0.75  0.00  0.50  0.50  0.75  0.00  0.75  0.50  0.50  0.75  ...  0.00  0.00  0.25  0.25  0.50  0.75  0.75  0.00  0.00  0.00  0.25  0.25  0.50  0.25  0.25  0.00  0.75  0.50  0.00  0.50  0.75  0.25  0.50  1.00  0.50
4       0.50  0.50  1.00  0.25  0.50  0.25  0.50  0.75  0.25  0.00  1.00  0.75  0.50  0.25  0.50  1.00  0.00  1.00  0.25  0.25  0.25  0.00  0.25  1.00  0.75  ...  0.75  1.00  0.25  0.75  0.50  1.00  0.50  0.75  1.00  0.75  0.00  0.25  0.25  0.25  0.75  1.00  0.00  0.00  0.00  0.00  0.50  0.00  0.50  0.25  0.00
...      ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...  ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...   ...
501803  0.00  0.75  0.25  0.75  0.00  0.00  0.00  0.00  0.25  1.00  0.25  0.00  1.00  0.00  1.00  0.00  1.00  0.75  0.75  0.00  0.25  1.00  1.00  0.00  0.50  ...  1.00  1.00  0.50  0.75  1.00  0.25  1.00  0.25  0.75  1.00  0.25  1.00  0.00  0.00  0.25  1.00  1.00  0.00  0.00  0.75  0.00  0.50  0.25  0.50  0.75
501804  0.50  0.50  1.00  1.00  0.00  1.00  0.50  0.00  0.00  1.00  0.00  1.00  1.00  1.00  0.25  0.75  0.50  0.75  0.25  0.50  0.50  0.00  0.00  0.50  0.25  ...  0.75  1.00  0.00  1.00  0.00  0.75  0.00  0.25  1.00  0.25  0.00  0.50  1.00  0.50  1.00  0.25  0.25  0.00  0.00  0.25  0.75  0.25  1.00  0.50  1.00
501805  0.75  0.75  0.25  0.50  1.00  0.25  0.00  0.25  0.00  0.50  1.00  0.25  0.00  0.25  1.00  0.50  0.25  0.75  1.00  0.25  0.50  0.75  0.00  0.00  1.00  ...  0.00  0.50  0.25  0.00  0.00  0.00  0.25  0.00  0.50  0.25  1.00  1.00  0.50  0.25  0.00  1.00  0.75  0.25  0.00  0.50  0.00  1.00  1.00  0.00  0.75
501806  0.25  0.25  0.75  0.75  0.75  0.00  0.50  0.75  0.25  0.50  0.25  0.25  0.50  0.00  0.75  0.50  0.50  0.75  1.00  0.00  1.00  0.25  0.00  0.25  0.50  ...  0.50  0.50  1.00  1.00  1.00  1.00  1.00  0.25  1.00  0.75  0.75  0.25  0.75  0.00  0.50  0.00  0.00  1.00  0.50  0.75  0.75  1.00  0.00  0.50  1.00
501807  0.25  1.00  0.50  0.25  0.00  0.75  1.00  0.50  0.75  0.00  0.00  0.75  0.50  0.00  0.25  1.00  0.50  0.00  0.25  0.75  0.00  0.00  0.00  0.00  0.75  ...  0.75  0.00  0.75  0.25  0.75  1.00  0.75  0.50  0.00  1.00  0.25  0.00  0.25  0.75  0.75  0.50  0.25  0.75  0.00  0.00  0.25  0.25  0.75  1.00  1.00

[501808 rows x 89 columns]
start fit
[WARNING] [2021-09-20 21:00:20,620:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 1. Number of dummy models: 1
[WARNING] [2021-09-20 21:06:22,006:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 1. Number of dummy models: 1
[WARNING] [2021-09-20 21:08:59,789:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 2. Number of dummy models: 1
[WARNING] [2021-09-20 21:09:03,401:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 2. Number of dummy models: 1
[WARNING] [2021-09-20 21:15:04,783:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 2. Number of dummy models: 1
[WARNING] [2021-09-20 21:21:06,217:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 2. Number of dummy models: 1
[WARNING] [2021-09-20 21:21:40,158:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 3. Number of dummy models: 1
[WARNING] [2021-09-20 21:27:41,444:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 3. Number of dummy models: 1
[WARNING] [2021-09-20 21:27:44,680:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 4. Number of dummy models: 1
[WARNING] [2021-09-20 21:27:48,724:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 4. Number of dummy models: 1
[WARNING] [2021-09-20 21:32:48,108:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 5. Number of dummy models: 1
[WARNING] [2021-09-20 21:32:55,098:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 6. Number of dummy models: 1
[WARNING] [2021-09-20 21:33:17,063:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 6. Number of dummy models: 1
[WARNING] [2021-09-20 21:39:18,457:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 6. Number of dummy models: 1
[WARNING] [2021-09-20 21:39:22,160:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 6. Number of dummy models: 1
[WARNING] [2021-09-20 21:45:23,584:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 6. Number of dummy models: 1
[WARNING] [2021-09-20 21:51:25,029:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 6. Number of dummy models: 1
[WARNING] [2021-09-20 21:51:28,140:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 6. Number of dummy models: 1
[WARNING] [2021-09-20 21:53:47,552:Client-EnsembleBuilder] No models better than random - using Dummy loss!Number of models besides current dummy model: 6. Number of dummy models: 1
end fit
df_cv_results
    mean_test_score  mean_fit_time                                             params  rank_test_scores   status  budgets  ... param_regressor:libsvm_svr:gamma param_regressor:mlp:validation_fraction param_regressor:sgd:epsilon param_regressor:sgd:eta0 param_regressor:sgd:l1_ratio param_regressor:sgd:power_t
0          0.000000     360.106616  {'data_preprocessing:categorical_transformer:c...                 1  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
8          0.000000     360.011791  {'data_preprocessing:categorical_transformer:c...                 1  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
18         0.000000       1.826367  {'data_preprocessing:categorical_transformer:c...                 1   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
17         0.000000     360.108457  {'data_preprocessing:categorical_transformer:c...                 1  Timeout      0.0  ...                         0.032332                                     NaN                         NaN                      NaN                          NaN                         NaN
16         0.000000     360.104102  {'data_preprocessing:categorical_transformer:c...                 1  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
15         0.000000       2.424781  {'data_preprocessing:categorical_transformer:c...                 1   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
14         0.000000     360.104379  {'data_preprocessing:categorical_transformer:c...                 1  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
13         0.000000      20.691383  {'data_preprocessing:categorical_transformer:c...                 1   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
10         0.000000       2.746277  {'data_preprocessing:categorical_transformer:c...                 1   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
6          0.000000     360.105433  {'data_preprocessing:categorical_transformer:c...                 1  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
5          0.000000     360.104351  {'data_preprocessing:categorical_transformer:c...                 1  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
4          0.000000       2.180909  {'data_preprocessing:categorical_transformer:c...                 1   Memout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
2          0.000000     360.104890  {'data_preprocessing:categorical_transformer:c...                 1  Timeout      0.0  ...                         0.002623                                     NaN                         NaN                      NaN                          NaN                         NaN
19         0.000000     138.103932  {'data_preprocessing:categorical_transformer:c...                 1  Timeout      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
3         -0.000005     157.529032  {'data_preprocessing:categorical_transformer:c...                15  Success      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
9         -0.000006       2.959799  {'data_preprocessing:categorical_transformer:c...                16  Success      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
12        -0.000013       6.571774  {'data_preprocessing:categorical_transformer:c...                17  Success      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
1         -0.000317      12.087915  {'data_preprocessing:categorical_transformer:c...                18  Success      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
7         -0.001563      33.671417  {'data_preprocessing:categorical_transformer:c...                19  Success      0.0  ...                              NaN                                     NaN                         NaN                      NaN                          NaN                         NaN
11        -0.002651     299.073780  {'data_preprocessing:categorical_transformer:c...                20  Success      0.0  ...                              NaN                                     0.1                         NaN                      NaN                          NaN                         NaN

[20 rows x 161 columns]
automl.leaderboard
Traceback (most recent call last):
  File "app.py", line 34, in <module>
    print(automl.leaderboard(detailed = True, ensemble_only=False))
  File "/home/i/dev/sources/mytest/.venv/lib/python3.7/site-packages/autosklearn/estimators.py", line 741, in leaderboard
    model_runs[model_id]['ensemble_weight'] = weight
KeyError: 1

@eddiebergman I tried setting n_jobs to 2, 3, 4, and so on up to 8. In every case I got a Killed message on my console and the program just stopped. Based on this Stack Overflow question, I think this is the same kind of memory issue: https://stackoverflow.com/questions/19189522/what-does-killed-mean-when-a-processing-of-a-huge-csv-with-python-which-sudde
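One possible explanation for the Killed messages, offered as an assumption rather than a diagnosis: the auto-sklearn documentation describes memory_limit as applying per job, so the total memory a run is allowed to use grows with n_jobs. A small sketch of that arithmetic, where `total_allowed_mb` is a hypothetical helper and not part of auto-sklearn:

```python
def total_allowed_mb(memory_limit_mb: int, n_jobs: int) -> int:
    """Upper bound on combined memory if the limit applies to each job separately."""
    return memory_limit_mb * n_jobs


# memory_limit=27000 as in the script above, for a few of the n_jobs values tried.
for n_jobs in (1, 2, 4, 8):
    total = total_allowed_mb(27000, n_jobs)
    print(f"n_jobs={n_jobs}: up to {total} MB ({total / 1024:.1f} GiB)")
```

If the machine has less physical memory than that bound, the operating system may kill the process before auto-sklearn's own limit is ever reached, which would be consistent with a bare Killed message.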

Do the same issues appear at smaller timescales? i.e. 600s total time and 60s per model? -> no, with smaller timescales it finishes successfully. I think 120s total was successful once when I tried to play with this.

Did you try setting memory_limit=None? -> not yet, I will try to do that and post the df_cv_results['params'] too.

Thank you so much!

eddiebergman commented 3 years ago

Hmm so let me address this in a few points:

In general it's quite difficult to allocate resources for runs as long as yours, but it seems like something we should try testing soon. We appreciate your time and effort, and hopefully we can figure this out.

eddiebergman commented 2 years ago

Seeing as there has been no response, we're not sure if this has been solved so closing the issue for now. Feel free to re-open this if anything reoccurs.