Open dadangsetio opened 2 years ago
Hi @dadangsetio, we use sklearn.utils.multiclass.type_of_target to identify the task type based on the y you pass in. My guess is that it looks something like [0, 1, 0, 1, 1, ...], which gets identified as a binary classification problem. Is this your intended behavior? If so, then I'm not sure we have any way to override this behaviour, but I can look into it if it is.
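To see how this detection behaves, the check can be run directly on a few label arrays. This is a minimal sketch: the example arrays are made up for illustration, and only type_of_target itself comes from scikit-learn.

```python
from sklearn.utils.multiclass import type_of_target

# Hypothetical targets, chosen only to illustrate the detection rule.
binary_like = [0, 1, 0, 1, 1]          # two distinct integer values
continuous_like = [0.12, 0.87, 0.45]   # non-integer floats

detected_binary = type_of_target(binary_like)
detected_continuous = type_of_target(continuous_like)

print(detected_binary)      # binary
print(detected_continuous)  # continuous
```

A target holding only 0s and 1s is therefore reported as a classification target, whatever the user intended.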
Thank you for the response, @eddiebergman. You are right that the content of y is binary, so how can I solve this?
You may prefer to use probability scores from predict_proba and use a Classifier instead of a Regressor.
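As a rough illustration of that suggestion, the sketch below gets continuous scores out of a classifier through predict_proba. It uses scikit-learn's LogisticRegression as a stand-in (an assumption made to keep the example short and quick to run); auto-sklearn's classifiers follow the scikit-learn estimator interface, so the same call should apply there.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data, only for illustration.
X, y = make_classification(n_samples=200, random_state=0)

# Stand-in classifier (assumption): any estimator exposing predict_proba works.
clf = LogisticRegression(max_iter=1000).fit(X, y)

# predict_proba returns one column per class; column 1 is P(y == 1).
scores = clf.predict_proba(X)[:, 1]
print(scores[:5])
```

The scores are floats between 0 and 1, which may cover the need for a continuous output without treating the problem as regression.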
If you really need to skip the type_of_target check, then you'll need to use the AutoML class instead of AutoSklearnRegressor, which is just a fancy wrapper that makes some things simpler. Depending on your use case, this should be okay.
Here's a sample snippet:
from sklearn.datasets import make_classification
from autosklearn.automl import AutoML
from autosklearn.constants import REGRESSION

X, y = make_classification()
print(y)  # [0, 0, 1, ...]

automl = AutoML(
    time_left_for_this_task=30,
    per_run_time_limit=5,
    ...,
)
# Pass the task explicitly so the type_of_target check is skipped
automl.fit(X, y, task=REGRESSION, ...)
Here are the __init__(...) and the fit(...) calls from AutoML for you.
Best, Eddie
I am using the sample AutoML snippet, but I'm getting an error like this:
[ERROR] [2022-11-07 19:18:21,120:Client-AutoML(1):441115fc-5e96-11ed-acf3-363077345c9d] (' Dummy prediction failed with run state StatusType.CRASHED and additional output: {\'error\': \'Result queue is empty\', \'exit_status\': "<class \'pynisher.limit_function_call.AnythingException\'>", \'subprocess_stdout\': \'\', \'subprocess_stderr\': \'Process pynisher function call:\\nTraceback (most recent call last):\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap\\n self.run()\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 108, in run\\n self._target(*self._args, **self._kwargs)\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/pynisher/limit_function_call.py", line 108, in subprocess_func\\n resource.setrlimit(resource.RLIMIT_AS, (mem_in_b, mem_in_b))\\nValueError: current limit exceeds maximum limit\\n\', \'exitcode\': 1, \'configuration_origin\': \'DUMMY\'}.',)
Traceback (most recent call last):
File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/autosklearn/automl.py", line 765, in fit
self._do_dummy_prediction()
File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/autosklearn/automl.py", line 489, in _do_dummy_prediction
raise ValueError(msg)
ValueError: (' Dummy prediction failed with run state StatusType.CRASHED and additional output: {\'error\': \'Result queue is empty\', \'exit_status\': "<class \'pynisher.limit_function_call.AnythingException\'>", \'subprocess_stdout\': \'\', \'subprocess_stderr\': \'Process pynisher function call:\\nTraceback (most recent call last):\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap\\n self.run()\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 108, in run\\n self._target(*self._args, **self._kwargs)\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/pynisher/limit_function_call.py", line 108, in subprocess_func\\n resource.setrlimit(resource.RLIMIT_AS, (mem_in_b, mem_in_b))\\nValueError: current limit exceeds maximum limit\\n\', \'exitcode\': 1, \'configuration_origin\': \'DUMMY\'}.',)
You should construct AutoML with the same parameters you used for the estimator in your original code. My guess is that you had set memory_limit=None.
The issue is that there is no way to limit the memory of processes on Mac, as far as I know. See https://github.com/automl/pynisher#features
The version of pynisher linked above is actually newer than the one we use, and we need to update to it.
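The traceback above fails inside resource.setrlimit(resource.RLIMIT_AS, ...). A small probe, sketched below, can check whether a given machine accepts that call at all. The helper name can_limit_address_space and the 4 GiB default are assumptions made for this example and are not part of pynisher or auto-sklearn. The probe runs in a child process so the limit never applies to the calling interpreter.

```python
import subprocess
import sys

# Code executed in a throwaway child process: try to cap the address space
# the same way the traceback shows, and report success via the exit code.
_PROBE = """
import resource, sys
limit = int(sys.argv[1])
try:
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))
except (ValueError, OSError):
    sys.exit(1)
"""


def can_limit_address_space(limit_bytes=4 * 1024**3):
    """Return True if a child process could set RLIMIT_AS to limit_bytes."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE, str(limit_bytes)],
        capture_output=True,
    )
    return result.returncode == 0


supported = can_limit_address_space()
print("RLIMIT_AS can be set:", supported)
```

If this prints False, a memory limit enforced through RLIMIT_AS is unlikely to work on that machine.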
classifier = AutoSklearn2Classifier(
    time_left_for_this_task=15 * 60,
    per_run_time_limit=30,
    memory_limit=None,
    n_jobs=1,
    max_models_on_disc=10,
    ensemble_size=10,
).fit(
    preprocessor.transform(train_x),
    train_y,
    preprocessor.transform(valid_x),
    valid_y,
)
There is an internal check that prohibits running without a memory limit:
[ERROR] [2024-07-18 15:19:23,002:Client-AutoML(1):5923f702-4508-11ef-82ea-42442fa1d044] '>' not supported between instances of 'NoneType' and 'int'
Traceback (most recent call last):
File "/Users/Viktor/PycharmProjects/laion-copyright/.venv39/lib/python3.9/site-packages/autosklearn/automl.py", line 680, in fit
X, y = reduce_dataset_size_if_too_large(
File "/Users/Viktor/PycharmProjects/laion-copyright/.venv39/lib/python3.9/site-packages/autosklearn/util/data.py", line 430, in reduce_dataset_size_if_too_large
assert memory_limit > 0
TypeError: '>' not supported between instances of 'NoneType' and 'int'
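The failure surfaces as a TypeError rather than an AssertionError because the comparison is evaluated before assert ever sees its result. A minimal sketch of that behaviour, using a hypothetical stand-in function rather than auto-sklearn's own code:

```python
def check_memory_limit(memory_limit):
    # Hypothetical stand-in mirroring the `assert memory_limit > 0`
    # line shown in the traceback.
    assert memory_limit > 0


try:
    check_memory_limit(None)
except TypeError as exc:
    message = str(exc)

print(message)
# '>' not supported between instances of 'NoneType' and 'int'
```

With an integer limit the same check passes, so the failure comes specifically from passing None.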
It's such a shame we cannot use auto-sklearn on Apple Silicon. Hopefully one day you find a workaround!
Yes, it's true, I used to feel like that @ViktorooReps
Can't fit a model with AutoMLRegression.
This is my log: