Extending auto-sklearn with a regression component crashes when run in parallel (n_jobs > 1)

fabricekfr commented 3 years ago

Describe the bug

I have a problem when I extend auto-sklearn with a new regression component and run it in parallel (n_jobs > 1). I get the error 'Trying to include unknown component: KernelRidgeRegression' even when I guard the code that invokes auto-sklearn with “if __name__ == '__main__':”.

To Reproduce

Steps to reproduce the behavior:

1. Copy the example "Extending Auto-Sklearn with Regression Component" from the auto-sklearn 0.11.1 Examples page.
2. Guard the code with “if __name__ == '__main__':” before the line reg = autosklearn.regression.AutoSklearnRegressor.
3. Add the parameter n_jobs=2 to reg = autosklearn.regression.AutoSklearnRegressor.

The code then looks like this:

if __name__ == '__main__':
    reg = autosklearn.regression.AutoSklearnRegressor(
        time_left_for_this_task=30,
        per_run_time_limit=10,
        include_estimators=['KernelRidgeRegression'],
        # The two flags below are provided to speed up calculations
        # Not recommended for a real implementation
        initial_configurations_via_metalearning=0,
        smac_scenario_args={'runcount_limit': 5},
        n_jobs=2
    )
    reg.fit(X_train, y_train)

    ############################################################################
    # Print prediction score and statistics
    # =====================================
    y_pred = reg.predict(X_test)
    print("r2 score: ", sklearn.metrics.r2_score(y_test, y_pred))
    print(reg.show_models())

Expected behavior

I do not expect to get MyDummyRegressor as the best model.

Actual behavior, stacktrace or logfile

The logs contain this error:

[DEBUG] [2020-11-23 17:35:52,888:AutoMLSMBO(1)::ee1535a037e7bc278ee41205ec071bdb] Finished function evaluation. Status: StatusType.CRASHED, Cost: 1.000000, Runtime: 1.476809, Additional {'traceback': 'Traceback (most recent call last):\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/evaluation/__init__.py", line 31, in fit_predict_try_except_decorator\n return ta(queue=queue, **kwargs)\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/evaluation/train_evaluator.py", line 1075, in eval_holdout\n evaluator.fit_predict_and_loss(iterative=iterative)\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/evaluation/train_evaluator.py", line 465, in fit_predict_and_loss\n add_model_to_self=self.num_cv_folds == 1,\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/evaluation/train_evaluator.py", line 798, in _partial_fit_and_predict_standard\n model = self._get_model()\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/evaluation/abstract_evaluator.py", line 221, in _get_model\n init_params=self._init_params)\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/pipeline/regression.py", line 76, in __init__\n init_params=init_params)\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/pipeline/base.py", line 35, in __init__\n self.config_space = self.get_hyperparameter_search_space()\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/pipeline/base.py", line 228, in get_hyperparameter_search_space\n dataset_properties=self.dataset_properties)\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/pipeline/regression.py", line 151, in _get_hyperparameter_search_space\n exclude=exclude, include=include, pipeline=self.steps)\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/pipeline/base.py", line 306, in _get_base_search_space\n pipeline, dataset_properties, include=include, exclude=exclude)\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/pipeline/create_searchspace_util.py", line 35, in get_match_array\n exclude=node_exclude).keys()))\n File "/usr/local/lib/python3.6/dist-packages/autosklearn/pipeline/components/regression/__init__.py", line 47, in get_available_components\n "%s" % incl)\nValueError: Trying to include unknown component: KernelRidgeRegression\n', 'error': "ValueError('Trying to include unknown component: KernelRidgeRegression',)", 'configuration_origin': 'Random Search'}

Environment and installation:

Google Colab

Python version: ['3.6.9 (default, Oct 8 2020, 12:12:24) ', '[GCC 8.4.0]']
Distribution: ('Ubuntu', '18.04', 'bionic')
System: Linux
Machine: x86_64
Platform: Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
Version: #1 SMP Thu Jul 23 08:00:38 PDT 2020
setuptools 50.3.2
numpy 1.18.5
scipy 1.4.1
joblib 0.17.0
scikit-learn 0.22.2.post1
dask 2.30.0
distributed 2.30.1
lockfile 0.12.2
PyYAML 3.13
pandas 1.1.4
liac-arff 2.5.0
ConfigSpace 0.4.16
pynisher 0.6.2
pyrfr 0.8.0

AutoML(1)_f6b2033865f8542d2e894f50390c26a4.log

mfeurer commented 3 years ago

Thanks a lot for reporting this. Unfortunately, this must have happened when we moved from the "fork" context to the "spawn" context for creating subprocesses. We'll need to take a closer look at what's going wrong here, as I don't think there will be an easy fix for this.
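
If the fork-to-spawn diagnosis above is right, the failure mode can be illustrated with a toy model. This is only a sketch and does not use auto-sklearn or start any real processes: worker_registry and get_available_components are hypothetical helpers written for this illustration (the second merely mirrors the error message from the log). The assumption being modelled is that a forked worker starts as a copy of the parent's memory, while a spawned worker starts a fresh interpreter and only sees registrations that run again at import time.

```python
def get_available_components(include, registry):
    """Toy version of the include check that produced the error in the log."""
    for name in include:
        if name not in registry:
            raise ValueError("Trying to include unknown component: %s" % name)
    return list(include)


def worker_registry(start_method, parent_registry, import_time_names=()):
    """Return the registry a worker would see (simulated, no real processes)."""
    if start_method == "fork":
        # fork: the child starts as a copy of the parent's memory,
        # so runtime registrations made in the parent are visible.
        return set(parent_registry)
    if start_method == "spawn":
        # spawn: a fresh interpreter; only registrations that are
        # executed again at import time in the child are present.
        return set(import_time_names)
    raise ValueError("unsupported start method: %s" % start_method)


# The parent process registered the custom component at runtime.
parent = {"KernelRidgeRegression"}
include = ["KernelRidgeRegression"]

forked = worker_registry("fork", parent)
spawned = worker_registry("spawn", parent)

print(get_available_components(include, forked))
try:
    get_available_components(include, spawned)
except ValueError as exc:
    print(exc)
```

In this model the forked worker accepts the component, while the spawned worker raises the same ValueError as in the report, because the registration made in the parent never ran in the new interpreter.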

franchuterivera commented 3 years ago

@mfeurer do you think this is still required? It seems that moving to forkserver/threads solved this issue, since the server has these classes preloaded? For example, I ran this extending example with 2 jobs and I no longer see the issue, or at least I was not able to reproduce it here: https://github.com/franchuterivera/auto-sklearn/runs/1744657845?check_suite_focus=true

franchuterivera commented 3 years ago

I believe this problem is now solved in the development branch. Because of that, I am closing this issue.

Thanks a lot for reporting the problem.
