automl / auto-sklearn

Automated Machine Learning with scikit-learn
https://automl.github.io/auto-sklearn
BSD 3-Clause "New" or "Revised" License

threads limit "31199 current, 31199 max" after using fit function 2 times #1674

Open eyalElb opened 1 year ago

eyalElb commented 1 year ago

Describe the bug

After running the following two fits one after the other:

cls = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=2 * 60,
    per_run_time_limit=30,
    n_jobs=1,
    include={"classifier": ["extra_trees"]},
    initial_configurations_via_metalearning=0,
)
cls.fit(X_train, y_train)

cls2 = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=2 * 60,
    per_run_time_limit=30,
    n_jobs=1,
    include={"classifier": ["mlp"]},
    initial_configurations_via_metalearning=0,
)
cls2.fit(X_train, y_train, dataset_name="breast_cancer")

It crashes.

To Reproduce

Steps to reproduce the behavior:

import sklearn.datasets
import sklearn.metrics
import sklearn.model_selection  # needed for train_test_split below
import autosklearn.classification

X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    X, y, random_state=1
)

# First fit: default search space
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=2 * 60,
    per_run_time_limit=30,
)
automl.fit(X_train, y_train, dataset_name="breast_cancer")
print(automl.leaderboard())

# Second fit: extra_trees only, no metalearning
cls = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=2 * 60,
    per_run_time_limit=30,
    n_jobs=1,
    include={"classifier": ["extra_trees"]},
    initial_configurations_via_metalearning=0,
)
cls.fit(X_train, y_train, dataset_name="breast_cancer")

# Third fit: mlp only, no metalearning
cls2 = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=2 * 60,
    per_run_time_limit=30,
    n_jobs=1,
    include={"classifier": ["mlp"]},
    initial_configurations_via_metalearning=0,
)
cls2.fit(X_train, y_train, dataset_name="breast_cancer")

After a few seconds it crashes and stops, and I get the error below.

Expected behavior

It should not crash.

Actual behavior, stacktrace or logfile

OpenBLAS blas_thread_init: pthread_create failed for thread 17 of 20: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 31199 current, 31199 max
OpenBLAS blas_thread_init: pthread_create failed for thread 18 of 20: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 31199 current, 31199 max
OpenBLAS blas_thread_init: pthread_create failed for thread 19 of 20: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 31199 current, 31199 max
[ERROR] [2023-07-10 23:17:55,578:Client-AutoML(1):d9dfbf18-1f5e-11ee-b1c9-45401a5be18e] (' Dummy prediction failed with run state StatusType.CRASHED and additional output: {'error': 'Result queue is empty', 'exit_status': "<class 'pynisher.limit_function_call.AnythingException'>", 'subprocess_stdout': '', 'subprocess_stderr': 'Process pynisher function call:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/eyal/SubStrat/lib/python3.10/site-packages/pynisher/limit_function_call.py", line 133, in subprocess_func
    return_value = ((func(*args, **kwargs), 0))
  File "/home/eyal/SubStrat/lib/python3.10/site-packages/autosklearn/evaluation/__init__.py", line 55, in fit_predict_try_except_decorator
    return ta(queue=queue, **kwargs)
  File "/home/eyal/SubStrat/lib/python3.10/site-packages/autosklearn/evaluation/train_evaluator.py", line 1191, in eval_holdout
    evaluator = TrainEvaluator(
  File "/home/eyal/SubStrat/lib/python3.10/site-packages/autosklearn/evaluation/train_evaluator.py", line 206, in __init__
    super().__init__(
  File "/home/eyal/SubStrat/lib/python3.10/site-packages/autosklearn/evaluation/abstract_evaluator.py", line 215, in __init__
    threadpool_limits(limits=1)
  File "/home/eyal/SubStrat/lib/python3.10/site-packages/threadpoolctl.py", line 373, in __init__
    super().__init__(ThreadpoolController(), limits=limits, user_api=user_api)
  File "/home/eyal/SubStrat/lib/python3.10/site-packages/threadpoolctl.py", line 166, in __init__
    self._set_threadpool_limits()
  File "/home/eyal/SubStrat/lib/python3.10/site-packages/threadpoolctl.py", line 299, in _set_threadpool_limits
    lib_controller.set_num_threads(num_threads)
  File "/home/eyal/SubStrat/lib/python3.10/site-packages/threadpoolctl.py", line 865, in set_num_threads
    return set_func(num_threads)
KeyboardInterrupt
', 'exitcode': 1, 'configuration_origin': 'DUMMY'}.',)
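The OpenBLAS lines in the log report the RLIMIT_NPROC value at the moment pthread_create fails. To check the same numbers from Python between fits, something like the sketch below may help. It assumes a Linux machine with /proc mounted. The helper names nproc_limit and thread_count are made up for illustration and are not part of auto-sklearn. Note that RLIMIT_NPROC is counted per user, so the per-process thread count printed here is only one contribution to it.

```python
import resource
import threading


def nproc_limit():
    """Return the (soft, hard) RLIMIT_NPROC values for this process."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def thread_count(pid="self"):
    """Return the number of OS threads of a process (hypothetical helper).

    Reads the "Threads:" field of /proc/<pid>/status, which exists on Linux.
    Falls back to Python's own thread count if /proc cannot be read.
    """
    try:
        with open(f"/proc/{pid}/status") as fh:
            for line in fh:
                if line.startswith("Threads:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return threading.active_count()


if __name__ == "__main__":
    soft, hard = nproc_limit()
    print(f"RLIMIT_NPROC: {soft} current, {hard} max")
    print(f"threads in this process: {thread_count()}")
```

Calling thread_count() before and after each fit would show whether threads are left behind by an earlier run.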

Environment and installation:

Please give details about your installation:

AnaMiguelRodrigues1 commented 1 year ago

I am trying to use auto-sklearn and ran into a similar error. Interestingly, I hit it while running the example from the official documentation on my data center machine (GPU: Tesla K80, 24 GB; RAM: 128 GB). It might be an installation issue. I also do not understand ...
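One thing that may be worth trying for the OpenBLAS thread errors reported above is to cap the BLAS thread pools through environment variables before NumPy, scikit-learn, or auto-sklearn are imported. This is a sketch, not a fix confirmed in this thread. OPENBLAS_NUM_THREADS, OMP_NUM_THREADS, and MKL_NUM_THREADS are the usual variables read by those libraries at load time; the helper name cap_blas_threads is hypothetical.

```python
import os

# Environment variables commonly read by BLAS/OpenMP backends at load time.
BLAS_THREAD_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def cap_blas_threads(n=1, env=os.environ):
    """Set each thread-count variable to n unless it is already set.

    Hypothetical helper: it must run before numpy/sklearn/autosklearn are
    imported, because the libraries read these variables when they load.
    Returns the resulting values for inspection.
    """
    for name in BLAS_THREAD_VARS:
        env.setdefault(name, str(n))
    return {name: env[name] for name in BLAS_THREAD_VARS}


if __name__ == "__main__":
    print(cap_blas_threads(1))
    # Only import the numerical stack after the variables are in place, e.g.:
    # import autosklearn.classification
```

Using setdefault keeps any value already exported in the shell, so an existing configuration is not silently overridden.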