automl / auto-sklearn

Automated Machine Learning with scikit-learn
https://automl.github.io/auto-sklearn
BSD 3-Clause "New" or "Revised" License

Directory Not Empty Error Notebook in JupyterLab environment (Docker) #1005

Open CrosbyMonk opened 3 years ago

CrosbyMonk commented 3 years ago

Describe the bug

It appears auto-sklearn wants to delete a directory under the Python temp dir instead of using only the provided directories.

Code Snippet

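from autosklearn import regression

# Explicit working directories; keep the tmp folder after the run finishes.
automl = regression.AutoSklearnRegressor(
    tmp_folder='/users/jihh/automl/auto-sklearn/temp_housing/',
    output_folder='/users/jihh/automl/auto-sklearn/out_housing/',
    delete_tmp_folder_after_terminate=False,
)
automl.fit(X_train, y_train)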

Error:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-11-7a52a5d8533a> in <module>
----> 1 automl.fit(X_train, y_train)

~/.local/lib/python3.8/site-packages/autosklearn/estimators.py in fit(self, X, y, X_test, y_test, feat_type, dataset_name)
    719         # Fit is supposed to be idempotent!
    720         # But not if we use share_mode.
--> 721         super().fit(
    722             X=X,
    723             y=y,

~/.local/lib/python3.8/site-packages/autosklearn/estimators.py in fit(self, **kwargs)
    346             output_folder=self.output_folder,
    347         )
--> 348         self.automl_.fit(load_models=self._load_models, **kwargs)
    349 
    350         return self

~/.local/lib/python3.8/site-packages/autosklearn/automl.py in fit(self, X, y, X_test, y_test, feat_type, dataset_name, only_return_configuration_space, load_models)
   1264             self._metric = r2
   1265 
-> 1266         return super().fit(
   1267             X, y,
   1268             X_test=X_test,

~/.local/lib/python3.8/site-packages/autosklearn/automl.py in fit(self, X, y, task, X_test, y_test, feat_type, dataset_name, only_return_configuration_space, load_models)
    531         # == Perform dummy predictions
    532         num_run = 1
--> 533         self._do_dummy_prediction(datamanager, num_run)
    534 
    535         # = Create a searchspace

~/.local/lib/python3.8/site-packages/autosklearn/automl.py in _do_dummy_prediction(self, datamanager, num_run)
    315                                     **self._resampling_strategy_arguments)
    316 
--> 317         status, cost, runtime, additional_info = ta.run(num_run, cutoff=self._time_for_task)
    318         if status == StatusType.SUCCESS:
    319             self._logger.info("Finished creating dummy predictions.")

~/.local/lib/python3.8/site-packages/autosklearn/evaluation/__init__.py in run(self, config, instance, cutoff, seed, budget, instance_specific)
    274 
    275         obj = pynisher.enforce_limits(**arguments)(self.ta)
--> 276         obj(**obj_kwargs)
    277 
    278         if obj.exit_status in (pynisher.TimeoutException,

~/.local/lib/python3.8/site-packages/pynisher/limit_function_call.py in __call__(self2, *args, **kwargs)
    279                             self2.stderr = fh.read()
    280 
--> 281                         tmp_dir.cleanup()
    282 
    283                     # don't leave zombies behind

/opt/conda/lib/python3.8/tempfile.py in cleanup(self)
    829     def cleanup(self):
    830         if self._finalizer.detach():
--> 831             self._rmtree(self.name)

/opt/conda/lib/python3.8/tempfile.py in _rmtree(cls, name)
    811                 raise
    812 
--> 813         _shutil.rmtree(name, onerror=onerror)
    814 
    815     @classmethod

/opt/conda/lib/python3.8/shutil.py in rmtree(path, ignore_errors, onerror)
    717                     os.rmdir(path)
    718                 except OSError:
--> 719                     onerror(os.rmdir, path, sys.exc_info())
    720             else:
    721                 try:

/opt/conda/lib/python3.8/shutil.py in rmtree(path, ignore_errors, onerror)
    715                 _rmtree_safe_fd(fd, path, onerror)
    716                 try:
--> 717                     os.rmdir(path)
    718                 except OSError:
    719                     onerror(os.rmdir, path, sys.exc_info())

OSError: [Errno 39] Directory not empty: '/data/shared/tmp/tmpg00q7u62'

Contents of /data/shared/tmp/tmpg00q7u62:

jihh@:auto-sklearn$> ls -al /data/shared/tmp/tmpg00q7u62
total 1604
drwx------    2 jihh   mlp-discovery-users      0 Nov 13 04:48 .
drwxrwx--T 4135 nobody mlp-discovery-users 212765 Nov 13 04:48 ..

Contents of /users/jihh/automl/auto-sklearn/temp_housing/:

jihh@:auto-sklearn$> ls -al /users/jihh/automl/auto-sklearn/temp_housing/
total 88
drwxr-xr-x 3 jihh mlp-discovery-users    95 Nov 13 04:48  .
drwxr-xr-x 5 jihh mlp-discovery-users   179 Nov 13 04:49  ..
-rw-r--r-- 1 jihh mlp-discovery-users 15546 Nov 13 04:48 'AutoML(1):8ae1121ed217904c992ab3815468796a.log'
drwxr-xr-x 3 jihh mlp-discovery-users   128 Nov 13 04:48  .auto-sklearn

To Reproduce

Run the notebook in a JupyterLab environment.

Expected behavior

I expect auto-sklearn not to manage (or try to delete) directories that it doesn't need to.

Actual behavior, stacktrace or logfile

auto_sklearn.log

Environment and installation:

Please give details about your installation:

JupyterLab running a version of the DataScience Notebook image. See auto_sklearn.log for version information.

mfeurer commented 3 years ago

Thanks a lot @CrosbyMonk for reporting this issue. It appears that the directory being deleted is one created by pynisher for storing the output of a subprocess. Therefore, it is expected that Auto-sklearn (through pynisher) tries to delete it.
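As a sketch of why `tmp_folder` has no effect on this path: the traceback shows the cleanup going through the standard library's `tempfile.TemporaryDirectory`, which picks its location from `TMPDIR` (or system defaults), not from any auto-sklearn argument. Assuming the installed pynisher creates that directory without an explicit `dir`, pointing `tempfile` at a local, non-shared location before calling `fit` might sidestep `/data/shared/tmp`. The path `/tmp/automl_scratch` is hypothetical and only for illustration.

```python
import os
import tempfile

# Hypothetical local scratch location; any writable, non-shared path would do.
local_tmp = os.path.join("/tmp", "automl_scratch")
os.makedirs(local_tmp, exist_ok=True)

# tempfile caches its choice, so set both the environment variable
# (inherited by subprocesses) and the module-level override.
os.environ["TMPDIR"] = local_tmp
tempfile.tempdir = local_tmp

# Any TemporaryDirectory created without an explicit `dir` now lands here.
with tempfile.TemporaryDirectory() as scratch:
    print(scratch)
```

This would need to run in the notebook before the estimator is fitted. Whether it avoids the error depends on the shared filesystem being the cause, which this thread has not confirmed.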

However, this raises the question of why the temporary directory is not empty. Are you still able to see the contents of that directory and the files in it? Maybe the cleanup and the join need to be switched (https://github.com/automl/pynisher/blob/master/pynisher/limit_function_call.py#L281)?
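If the directory is only transiently non-empty (for example because a subprocess has not fully released its files yet), a tolerant cleanup could retry instead of failing on the first `ENOTEMPTY`. The helper below is a hypothetical sketch built on the standard library; it is not part of pynisher or auto-sklearn, and the retry count and delay are arbitrary.

```python
import errno
import os
import shutil
import tempfile
import time


def rmtree_with_retry(path, attempts=5, delay=0.2):
    """Remove `path`, retrying when the OS reports the directory as non-empty.

    Hypothetical helper, not part of pynisher or auto-sklearn.
    Returns True if the directory is gone afterwards, False otherwise.
    """
    for _ in range(attempts):
        try:
            shutil.rmtree(path)
            return True
        except FileNotFoundError:
            # Already removed by someone else: nothing left to do.
            return True
        except OSError as exc:
            # Only retry on "Directory not empty" (Errno 39 on Linux).
            if exc.errno != errno.ENOTEMPTY:
                raise
            time.sleep(delay)
    return not os.path.exists(path)


# Usage: create a scratch directory with one file, then remove it.
scratch = tempfile.mkdtemp()
with open(os.path.join(scratch, "stdout.txt"), "w") as fh:
    fh.write("subprocess output\n")
removed = rmtree_with_retry(scratch)
```

Retrying only masks the symptom; it does not explain why the directory is reported as non-empty in the first place.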

CrosbyMonk commented 3 years ago

The directory was empty from the time the process died. See the above output from doing an ls on /data/shared/tmp/tmpg00q7u62.
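Since an `ls` run afterwards cannot show what the directory held at the moment `rmdir` failed, one way to narrow this down is to record the listing inside the exception handler. The function below is a hypothetical diagnostic sketch using only the standard library; it does not hook into pynisher.

```python
import errno
import os
import shutil
import tempfile


def describe_failed_removal(path):
    """Try to remove `path`; on failure, report the errno name and leftovers.

    Hypothetical diagnostic helper. Returns None when removal succeeds.
    """
    try:
        shutil.rmtree(path)
    except OSError as exc:
        # Capture the directory contents right at the time of the failure.
        leftover = sorted(os.listdir(path)) if os.path.isdir(path) else []
        return {"errno": errno.errorcode.get(exc.errno), "entries": leftover}
    return None


# Usage: a normal directory is removed cleanly; a missing path reports ENOENT.
ok_dir = tempfile.mkdtemp()
result_ok = describe_failed_removal(ok_dir)
result_missing = describe_failed_removal(os.path.join(ok_dir, "missing"))
```

An `ENOTEMPTY` result with an empty `entries` list would suggest the filesystem itself is reporting stale state, rather than a leftover file.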

mfeurer commented 3 years ago

Hey, are you able to reproduce this consistently, or did it only happen a single time?

CrosbyMonk commented 3 years ago

Apparently I missed your comment. It is 100% reproducible for me. The Data Science notebook base image is jupyter/datascience-notebook:7e07b801d92b, with the following additional packages installed.

docker
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
    less \
    apt-transport-https \
    apt-utils \
    build-essential \
    curl \
    freeglut3-dev \
    gdebi-core \
    git \
    graphviz \
    krb5-config \
    krb5-user \
    libclang-dev \
    libcurl4-openssl-dev \
    libedit2 \
    libnlopt-dev \
    libsasl2-dev \
    libsasl2-modules-gssapi-mit \
    libspatialindex-dev \
    libkrb5-dev \
    libssl1.1 \
    libssl-dev \
    libxml2-dev \
    netcat \
    net-tools \
    openssh-server \
    psmisc \
    rsync \
    sf-dpl \
    vim \
    tesseract-ocr-all \
    xvfb \
    && apt upgrade -y \
    && apt-get autoclean \
    && apt-get clean \
    && apt-get autoremove -y 

and

docker
RUN python3 -m pip --no-cache-dir install --upgrade \
    bs4 \
    cloudpickle \
    configparser \
    cython \
    flask \
    graphviz \
    impyla \
    ipywidgets \
    kerberos \
    matplotlib \
    numpy \
    pandas \
    pandasql \
    pytest \
    sasl \
    scikit-learn \
    scipy \
    setuptools \
    thrift \
    thrift_sasl==0.2.1

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs for the next 7 days. Thank you for your contributions.