Hi @shabir1,
While you may want to exclude algorithms that hit the memory limit, this is usually indicative of not assigning enough memory for your data. One thing that will be emphasized in the next release of the documentation is that memory_limit is per job when using n_jobs, i.e. 8GB with 2 jobs means auto-sklearn is allowed to use a maximum of 16GB of memory.
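For illustration, a minimal sketch of that interaction (the values here are placeholders, but n_jobs and memory_limit are real AutoSklearnClassifier arguments):

from autosklearn.classification import AutoSklearnClassifier

# memory_limit is in MB and applies per job: with n_jobs=2 and
# memory_limit=8192, the two parallel jobs together may use up to
# 2 * 8GB = 16GB of memory.
automl = AutoSklearnClassifier(
    time_left_for_this_task=3600,
    n_jobs=2,
    memory_limit=8192,
)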
You can find the status of individual configurations with:
automl.leaderboard(detailed=True, ensemble_only=False)
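For example, a quick sketch to pull out only the non-successful runs (assuming a fitted automl; the status column renders as in the output pasted further down this thread, though the exact representation may vary by version):

# leaderboard() returns a pandas DataFrame; keep rows whose status
# is anything other than StatusType.SUCCESS.
board = automl.leaderboard(detailed=True, ensemble_only=False)
failed = board[board["status"].astype(str) != "StatusType.SUCCESS"]
print(failed[["type", "status"]])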
@eddiebergman Below is my model's sprint_statistics
Metric: accuracy
Best validation score: 0.050159
Number of target algorithm runs: 21
Number of successful target algorithm runs: 1
Number of crashed target algorithm runs: 2
Number of target algorithms that exceeded the time limit: 9
Number of target algorithms that exceeded the memory limit: 9
Now I want to know which algorithms exceeded the time/memory limit. Is there any way to find this out?
@eddiebergman And thank you for your fast response :+1:
The leaderboard should show the type of algorithm along with its status.
@eddiebergman I tried leaderboard, but it only gives me the successful algorithms; the failed ones are not there.
import autosklearn.classification

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=3600,
    memory_limit=3072,
    ensemble_size=10,
    ensemble_nbest=10,
    max_models_on_disc=10,
)
automl.sprint_statistics()
auto-sklearn results:
  Dataset name: 63b6593f-5043-11ec-9569-35bb9d3e2abc
  Metric: accuracy
  Best validation score: 0.999174
  Number of target algorithm runs: 17
  Number of successful target algorithm runs: 4
  Number of crashed target algorithm runs: 2
  Number of target algorithms that exceeded the time limit: 9
  Number of target algorithms that exceeded the memory limit: 2
Then I did:
a = automl.leaderboard(detailed=True, ensemble_only=False)
a.head(100)
rank ensemble_weight type cost duration config_id train_loss seed start_time end_time budget status data_preprocessors feature_preprocessors balancing_strategy config_origin
model_id
11 1 1.0 adaboost 0.000826 134.152884 10 0.000820 0 1.638103e+09 1.638103e+09 0.0 StatusType.SUCCESS [] [select_rates_classification] none Initial design
10 2 0.0 mlp 0.000840 175.563798 9 0.000840 0 1.638103e+09 1.638103e+09 0.0 StatusType.SUCCESS [] [feature_agglomeration] none Initial design
12 3 0.0 lda 0.000840 54.700491 11 0.000840 0 1.638103e+09 1.638103e+09 0.0 StatusType.SUCCESS [] [liblinear_svc_preprocessor] weighting Initial design
17 4 0.0 gaussian_nb 0.012376 7.900610 16 0.012351 0 1.638104e+09 1.638104e+09 0.0 StatusType.SUCCESS [] [feature_agglomeration] none Initial design
Hmm, I'm not sure why they don't show up, but you can check out this example to see if it helps. We have some work going on to redo the backend, which will lead to better interfaces for these kinds of things, but for now I suggest looking at the runhistory section of that example in particular to see what you can get out of it. Apologies.
@eddiebergman I tried:
for run_key in automl.automl_.runhistory_.data:
    print('#########')
    print(run_key)
    print(automl.automl_.runhistory_.data[run_key])
Output:
#########
RunKey(config_id=1, instance_id='{"task_id": "6b9ec16e-50ed-11ec-b47c-11a7ef6ab556"}', seed=0, budget=0.0)
RunValue(cost=1.0, time=360.01602506637573, status=<StatusType.TIMEOUT: 2>, starttime=1638174126.4998682, endtime=1638174487.551495, additional_info={'error': 'Timeout', 'configuration_origin': 'Initial design'})
#########
RunKey(config_id=2, instance_id='{"task_id": "6b9ec16e-50ed-11ec-b47c-11a7ef6ab556"}', seed=0, budget=0.0)
RunValue(cost=1.0, time=6.928538799285889, status=<StatusType.CRASHED: 3>, starttime=1638174488.1645942, endtime=1638174496.1007042, additional_info={'error': 'Result queue is empty', 'exit_status': "<class 'pynisher.limit_function_call.AnythingException'>", 'subprocess_stdout': '', 'subprocess_stderr': '', 'exitcode': -11, 'configuration_origin': 'Initial design'})
#########
RunKey(config_id=3, instance_id='{"task_id": "6b9ec16e-50ed-11ec-b47c-11a7ef6ab556"}', seed=0, budget=0.0)
RunValue(cost=1.0, time=360.1103603839874, status=<StatusType.TIMEOUT: 2>, starttime=1638174496.29268, endtime=1638174857.4363446, additional_info={'error': 'Timeout', 'configuration_origin': 'Initial design'})
#########
and so on ...
But still, there is no information about the algorithm: which algorithm failed due to TIMEOUT, CRASHED, etc.?
This came up in a meeting today, and leaderboard will be updated to include all config_ids, even the ones that crashed and timed out (something I thought it already did). The runhistory shows the config_id of the algorithm that crashed; getting the config (the algorithm pipeline) that caused the crash/timeout is a bit more involved.
Here's a reproducible example:
import os
import pickle

import sklearn.datasets
import sklearn.metrics
from smac.tae import StatusType

from autosklearn.classification import AutoSklearnClassifier

# Fit once and cache the fitted model to disk so reruns are fast.
if os.path.exists("model"):
    with open("model", "rb") as f:
        automl = pickle.load(f)
else:
    X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
    automl = AutoSklearnClassifier(time_left_for_this_task=30, seed=1)
    automl.fit(X, y)
    with open("model", "wb") as f:
        pickle.dump(automl, f)

print(automl.sprint_statistics())

# ids_config maps config_id -> Configuration; data is the run history.
configs = automl.automl_.runhistory_.ids_config
history = automl.automl_.runhistory_.data

failed_ids = []
for rkey, rval in history.items():
    # STOP indicates the end of the optimization loop happened
    if rval.status not in [StatusType.SUCCESS, StatusType.STOP]:
        failed_ids.append(rkey.config_id)

for id in failed_ids:
    print(configs[id])
Here's the output:
auto-sklearn results:
Dataset name: d78ebf29-5119-11ec-9c2e-f47b09df72c1
Metric: accuracy
Best validation score: 0.973404
Number of target algorithm runs: 14
Number of successful target algorithm runs: 13
Number of crashed target algorithm runs: 0
Number of target algorithms that exceeded the time limit: 1
Number of target algorithms that exceeded the memory limit: 0
Configuration:
balancing:strategy, Value: 'weighting'
classifier:__choice__, Value: 'extra_trees'
classifier:extra_trees:bootstrap, Value: 'False'
classifier:extra_trees:criterion, Value: 'entropy'
classifier:extra_trees:max_depth, Constant: 'None'
classifier:extra_trees:max_features, Value: 0.993803313878608
classifier:extra_trees:max_leaf_nodes, Constant: 'None'
classifier:extra_trees:min_impurity_decrease, Constant: 0.0
classifier:extra_trees:min_samples_leaf, Value: 2
classifier:extra_trees:min_samples_split, Value: 20
classifier:extra_trees:min_weight_fraction_leaf, Constant: 0.0
data_preprocessor:__choice__, Value: 'feature_type'
data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'no_encoding'
data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'minority_coalescer'
data_preprocessor:feature_type:categorical_transformer:category_coalescence:minority_coalescer:minimum_fraction, Value: 0.41826215858914706
data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'median'
data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'robust_scaler'
data_preprocessor:feature_type:numerical_transformer:rescaling:robust_scaler:q_max, Value: 0.7305615609807856
data_preprocessor:feature_type:numerical_transformer:rescaling:robust_scaler:q_min, Value: 0.25595970768123566
feature_preprocessor:__choice__, Value: 'polynomial'
feature_preprocessor:polynomial:degree, Value: 2
feature_preprocessor:polynomial:include_bias, Value: 'True'
feature_preprocessor:polynomial:interaction_only, Value: 'True'
How can I find out which algorithms/configurations failed due to the memory limit / time limit? So next time I can exclude those algorithms.
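To close the loop on that, here is a sketch of how you might exclude the failing algorithms on a subsequent run, building on the configs and failed_ids variables from the reproducible example above (the exclude argument exists in recent auto-sklearn versions; older versions use exclude_estimators instead):

from autosklearn.classification import AutoSklearnClassifier

# Collect the classifier choice of every failed configuration.
failed_classifiers = {configs[i]["classifier:__choice__"] for i in failed_ids}

# Exclude those classifiers from the next search.
automl = AutoSklearnClassifier(
    time_left_for_this_task=3600,
    exclude={"classifier": sorted(failed_classifiers)},
)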